January 16, 2025

Mita Williams

In Praise of Bibliomancy

Copying. Mimesis. Autofiction. Bibliomancy. Copying. Mimesis. Autofiction. Bibliomancy. Copying. Mimesis. Autofiction. Bibliomancy. Borges.

by Mita Williams at January 16, 2025 09:42 PM

Jonathan Rochkind

Using CloudFlare Turnstile to protect certain pages on a Rails app

I work at a non-profit academic institution, on a site that manages, searches, and displays digitized historical materials: The Science History Institute Digital Collections.

Much of our stuff is public domain, and regardless we put this stuff on the web to be seen and used and shared. (Within the limits of copyright law and fair use; we are not the copyright holders of most of it). We have no general problem with people scraping our pages.

The problem is that our site, like many of our peers' sites, is being overwhelmed by poorly behaved bots. Lately one of the biggest problems is bots clicking on every possible combination of facet limits in our “faceted search”; this is not useful for them, and it overwhelms our site. Adding to the injury, “search” pages are one of the most resource-constrained categories of page on our present site. Peers say that even if we scaled up (auto or not), the bots sometimes scale up to match anyway!

One option would be putting some kind of “Web Application Firewall” (WAF) in front of the whole app. Our particular combination of team and budget and platform (heroku) makes a lot of these options expensive for us in licensing, staff time to manage, or both. Another option is certainly putting the whole thing behind the (ostensibly free) CloudFlare CDN and using its built-in WAF, but we’d like to avoid giving our DNS over to CloudFlare, I’ve heard mixed reviews of CloudFlare free staying free, and generally I am trying to avoid contributing to CloudFlare’s monopolistic, unaccountable control of the internet.

Ironically, then, the solution we arrived at still uses CloudFlare, but only Cloudflare’s Turnstile “captcha replacement”, one of those things that gives you the “check this box” or, more often, the entirely non-interactive “checking if you are a bot” UX.

[If you’re the tl;dr, look-at-the-code type, here’s the initial implementation PR in our open repo; there have been some bug fixes since then.]

While this still might unfortunately lock people using unconventional browsers etc out (just the latest of many complaints on HackerNews), we can use this to only protect our search pages. Most of our traffic comes directly from Google to an individual item detail page, which we can now leave completely out of it. We have complete control of allow-listing traffic based on whatever characteristics, when to present the challenge, etc. And it turns out we had a peer at another institution who had taken this approach and found it successful, so that was encouraging.

How it works: Overview

While typical documented Turnstile usage involves protecting form submissions, we actually want to protect certain urls, even when accessed via GET. Would this actually work well? What’s the best way to implement it?

Fortunately, when asking around on a chat for my professional community of librarian and archivist software hackers, Joe Corall from Lehigh University said they had done the exact same thing (even in response to the same problem, bots combinatorially exploring every possible facet value), and had super usefully written it up, and it had been working well for them.

Joe’s article and the flowchart it contains are worth looking at. His implementation is a Drupal plugin (used in at least several Islandora instances); the VuFind library discovery layer recently implemented a similar approach. We have a Rails app, so we needed to implement it ourselves. But with Joe paving the way (and patiently answering our questions, so we could start with the parameters that worked for him), it was pretty quick work, buoyed by the confidence that this approach wasn’t just an experiment in the blue, but had worked for a similar peer.

Joe allow-listed certain client domain names based on reverse IP lookup, but I’ve started without that, not wanting the performance hit on every request if I can avoid it. Joe also allow-listed their “on campus” IPs, but we are not a university and only have a few staff “on campus” and I always prefer to show the staff the same thing our users are seeing — if it’s inconvenient and intolerable, we want to feel the pain so we fix it, instead of never even seeing the pain and not knowing our users are getting it!

I’m going to explain and link to how we implemented this in a Rails app, and our choices of parameters for the various parameterized things. But also I’ll tell you we’ve written this in a way that paves the way to extracting to a gem — kept everything consolidated in a small number of files and very parameterized — so if there’s interest let me know. (Code4Lib-ers, our slack is a great place to get in touch, I’m jrochkind).

Ruby and Rails details, and our parameters

Here’s the implementing PR. It is written in such a way as to keep the code consolidated for future gem extraction, all in the BotDetectController class, which means that, kind of weirdly, there is some code to inject in class methods in the controller. While it does turnstile now, it’s written with variable/class names such that analogous products could be made available.

Rack-attack to meter

We were already using rack-attack to rate-limit. We added a “track” monitor with our code to decide when a client had passed a rate-limit gate to require a challenge. We start with allowing 10 requests per 12 hours (Joe at Lehigh did 20 per 24 hours), batched together in subnets. (Joe did subnets too, but we do smaller /24 (ie x.y.z.*) for ipv4 instead of Joe’s larger /16 (x.y.*.*)).
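As a rough illustration of the subnet batching described above, here is a minimal sketch using only Ruby's standard library. The helper name `subnet_bucket` is hypothetical and is not taken from the PR; the /24 prefix for IPv4 comes from the post, while the /64 prefix for IPv6 is purely an assumption, since the post only discusses IPv4.

```ruby
require "ipaddr"

IPV4_PREFIX = 24 # from the post: x.y.z.*
IPV6_PREFIX = 64 # assumption: the post does not say how IPv6 is grouped

# Hypothetical helper: collapse a client IP into the subnet string that
# could serve as the rate-limit bucket key (the "discriminator").
def subnet_bucket(ip_string)
  ip = IPAddr.new(ip_string)
  prefix = ip.ipv4? ? IPV4_PREFIX : IPV6_PREFIX
  "#{ip.mask(prefix)}/#{prefix}"
end

# Two clients in the same /24 land in the same bucket.
puts subnet_bucket("192.0.2.57")  # 192.0.2.0/24
puts subnet_bucket("192.0.2.200") # 192.0.2.0/24
```

A value like this would be what the block passed to a rack-attack track or throttle returns, so that every address in the subnet shares one counter.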

Note that rack-attack does not use sliding/rolling windows for rate limits, but fixed windows that reset after each window period. This makes a difference especially when you use as long a period as we do, but it’s not a problem with our very low count per period, and it does keep the RAM use extremely efficient (just an integer count per rate limit bucket).
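To make the fixed-window behaviour concrete, here is a small in-memory sketch. It is not rack-attack's code, just an illustration of the idea under the parameters mentioned above (10 requests per 12 hours); the class name is hypothetical, and a real cache store would also expire old windows, which this sketch does not.

```ruby
# Sketch of fixed-window counting: one integer per (bucket, window) pair.
class FixedWindowCounter
  def initialize(limit:, period:)
    @limit = limit
    @period = period
    @counts = Hash.new(0)
  end

  # Counts the request and returns true once the bucket has gone over
  # the limit within the current window.
  def exceeded?(bucket, now: Time.now)
    window = now.to_i / @period # integer window index; resets every period
    key = [bucket, window]
    @counts[key] += 1
    @counts[key] > @limit
  end
end

counter = FixedWindowCounter.new(limit: 10, period: 12 * 60 * 60)
```

Because the window index is simply the timestamp divided by the period, a client could make 10 requests at the very end of one window and 10 more at the start of the next without tripping the limit, which is the trade-off against a sliding window.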

When the rate limit is reached, the rack-attack block just sets a key/value in the rack_env to tell another component that a challenge is required. (setting in the session may have worked, but we want to be absolutely sure this will work even if client is not storing cookies, and this is really only meant as this-request state, so rack env seemed the good way to set state in rack-attack that could be seen in a rails controller)

Rails before_action filter to enforce challenge

There’s a Rails before_action filter that we just put on the application-wide ApplicationController, that looks for the “bot challenge key” required in the rack env — if present, and there isn’t anything in the session saying they have already passed a bot challenge, then we redirect to a “challenge” page, that will display/activate Turnstile.

We simply put the original/destination URL in a query param on that page. (And include logic to refuse to redirect to anything but a relative path on same host, to avoid any nefarious uses).
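The "relative path on same host only" rule could be sketched roughly as follows. This is a hypothetical helper, not the code from the PR, and the fallback of "/" is an assumption about what to do with a rejected destination.

```ruby
require "uri"

# Hypothetical helper: return the destination only when it is a plain
# path on the same host; otherwise return a safe fallback.
def safe_destination(dest, fallback: "/")
  return fallback unless dest.is_a?(String)
  return fallback unless dest.start_with?("/")
  # "//host/path" is protocol-relative, and some browsers treat "\" like "/".
  return fallback if dest.start_with?("//") || dest.include?("\\")

  uri = URI.parse(dest)
  return fallback if uri.scheme || uri.host

  dest
rescue URI::InvalidURIError
  fallback
end
```

Rejecting anything with a scheme or host means the query param cannot be used to bounce a visitor to a third-party site after the challenge.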

The challenge controller

One action in our BotDetectController just displays the turnstile challenge. The cloudflare turnstile callback gives us a token that we need to check server-side with the turnstile API, to confirm the challenge was really passed.

The front-end does a JS/xhr/fetch request to the second action in our BotDetectController. The back-end verify action makes the API call to turnstile, and if the challenge passed, sets a value in the Rails (encrypted and signed, secure) session with the time of the pass, so the before_action guard can give the user access.
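For readers who have not used Turnstile, the server-side check might look something like the sketch below, using only the standard library. The method names are hypothetical and this is not the code from our PR; the endpoint and the secret/response/remoteip form fields follow Cloudflare's documented siteverify API, and the secret key is assumed to come from your own configuration.

```ruby
require "json"
require "net/http"
require "uri"

SITEVERIFY_URL = URI("https://challenges.cloudflare.com/turnstile/v0/siteverify")

# Interpret the JSON body returned by siteverify; anything other than
# an object with "success": true counts as a failure.
def turnstile_passed?(response_body)
  data = JSON.parse(response_body)
  data.is_a?(Hash) && data["success"] == true
rescue JSON::ParserError
  false
end

# POST the token the browser received to Cloudflare for verification.
# secret_key is the site's Turnstile secret; remote_ip is optional.
def verify_turnstile(token, secret_key:, remote_ip: nil)
  params = { "secret" => secret_key, "response" => token }
  params["remoteip"] = remote_ip if remote_ip
  response = Net::HTTP.post_form(SITEVERIFY_URL, params)
  response.is_a?(Net::HTTPSuccess) && turnstile_passed?(response.body)
end
```

Keeping the response parsing separate from the network call makes the pass/fail decision easy to test without contacting Cloudflare.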

If the JS in front gets a go-ahead from the back-end, it uses JS location.replace to go to the original destination. This conveniently removes the challenge page from the user’s browser history, as if it never happened, with the browser back button still working great.

In most cases the challenge page, if non-interactive, won’t be displayed for more than a few seconds. (The language has been tweaked since these screenshots.)

We currently have a ‘pass’ good for 24 hours — once you pass a turnstile challenge, if your cookies/session are intact, you won’t be given another one for 24 hours no matter how much traffic. All of this is easily configurable.
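Putting the guard and the 24-hour pass together, the decision could be sketched as two plain functions. The key names here are hypothetical placeholders rather than the ones in our code, and the pass time is assumed to be stored in the session as epoch seconds.

```ruby
PASS_LIFETIME = 24 * 60 * 60 # seconds; the 24 hours mentioned above

# Hypothetical key names, for illustration only.
ENV_CHALLENGE_KEY = "bot_detect.should_challenge"
SESSION_PASS_KEY  = "bot_detect_passed_at"

# True when the session records a pass that has not yet expired.
def pass_still_valid?(session, now: Time.now)
  passed_at = session[SESSION_PASS_KEY]
  return false unless passed_at

  now.to_i - passed_at.to_i < PASS_LIFETIME
end

# True when the rate limiter flagged this request and the client has no
# current pass, i.e. when the filter should redirect to the challenge.
def challenge_needed?(rack_env, session, now: Time.now)
  rack_env[ENV_CHALLENGE_KEY] == true && !pass_still_valid?(session, now: now)
end
```

In a Rails before_action, the rack env would be request.env and the session would be the normal Rails session; plain hashes stand in for both here.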

If the challenge DOES fail for some reason, the user may be looking at the Challenge page with one of two kinds of failures, and some additional explanatory text and contact info.

Limitations and omissions

This particular flow only works for GET requests. It could be expanded to work for POST requests (with an invisible JS created/submitted form?), but our initial use case didn’t require it, so for now the filter just logs a warning and fails for POST.

This flow also isn’t going to work for fetch/ajax requests, it’s set up for ordinary navigation, since it redirects to a challenge then redirects back. Our use case is only protecting our search pages — but the blacklight search in our app has a JS fetch for “facet more” behavior. Couldn’t figure out a good/easy way to make this work, so for now we added an exemption config, and just exempt requests to the #facet action that look like they’re coming from fetch. Not bothered that an “attacker” could escape our bot detection for this one action; our main use case is stopping crawlers crawling indiscriminately, and I don’t think it’ll be a problem.
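One way to express an exemption like this is sketched below. It assumes that "looks like fetch" is judged by the Sec-Fetch-Dest request header, which browsers that send fetch metadata set to "empty" for fetch/XHR requests; the function name and the lowercase header hash are hypothetical, and older browsers that do not send the header would simply not be exempted.

```ruby
# Hypothetical exemption check: only the facet action, and only when the
# request advertises itself as a fetch/XHR rather than a page navigation.
def exempt_from_challenge?(action:, headers:)
  action == "facet" && headers["sec-fetch-dest"].to_s.downcase == "empty"
end
```

Since the header is client-supplied, this is trivially spoofable, which matches the trade-off described above: it is an accepted gap for a single action, not a security boundary.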

To get through the bot challenge requires a user-agent to have both JS and cookies enabled. JS may have been required before anyway (not sure), but cookies were not. Oh well. Only search pages are protected by the bot challenge.

The Lehigh implementation does a reverse-lookup of the client IP, and allow-lists clients from IPs that reverse lookup to desirable and well-behaved bots. We don’t do that, in part because I didn’t want the performance hit of the reverse-lookup. We have a Sitemap, and in general, I’m not sure we need bots crawling our search results pages at all… although I’m realizing as I write this that our “Collection” landing pages are included (as they show search results)… may want to exempt them, we’ll see how it goes.

We don’t have any client-based allow-listing… but would consider just exempting any client that has a user-agent admitting it’s a bot, all our problematic behavior has been from clients with user-agents appearing to be regular browsers (but obviously automated ones, if they are being honest).

Possible extensions and enhancements

We could possibly only enable the bot challenge when the site appears “under load”, whether that’s a certain number of overall requests per second, a certain machine load (but any auto-scaling can make that an issue), or size of heroku queue (possibly same).

We could use more sophisticated fingerprinting for rate limit buckets. Instead of IP-address-based, colleague David Cliff from Northeastern University has had success using HTTP user-agent, accept-encoding, and accept-language to fingerprint actors across distributed IPs, writing:

I know several others have had bot waves that have very deep IP address pools, and who fake their user agents, making it hard to ban.

We had been throttling based on the most common denominator (url pattern), but we were looking for something more effective that gave us more resource headroom.

On inspecting the requests in contrast to healthy user traffic we noticed that there were unifying patterns we could use, in the headers.

We made a fingerprint based on them, and after blocking based on that, I haven’t had to do a manual intervention since.

def fingerprint
  result = "#{env["HTTP_ACCEPT"]} | #{env["HTTP_ACCEPT_ENCODING"]} | #{env["HTTP_ACCEPT_LANGUAGE"]} | #{env["HTTP_COOKIE"]}"
  Base64.strict_encode64(result)
end

…the common rule we arrived at mixed positive/negative discrimination using the above

request.env["HTTP_ACCEPT"].blank? && request.env["HTTP_ACCEPT_LANGUAGE"].blank? && request.env["HTTP_COOKIE"].blank? && (request.user_agent.blank? || !request.user_agent.downcase.include?("bot".downcase))

so only a bot that left the fields blank and lied with a non-bot user agent would be affected
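To try David's fingerprint and rule outside a Rails app, here is a standalone sketch that takes a plain Rack-style env hash. It is an adaptation rather than his code: the local blank? helper stands in for ActiveSupport's, the user agent is read from HTTP_USER_AGENT instead of request.user_agent, and the name suspicious? is hypothetical.

```ruby
require "base64"

HEADER_KEYS = %w[HTTP_ACCEPT HTTP_ACCEPT_ENCODING HTTP_ACCEPT_LANGUAGE HTTP_COOKIE].freeze

# Stand-in for ActiveSupport's blank? on strings and nil.
def blank?(value)
  value.to_s.strip.empty?
end

# Same header fields as the fingerprint above, joined and Base64-encoded.
def fingerprint(env)
  Base64.strict_encode64(HEADER_KEYS.map { |key| env[key] }.join(" | "))
end

# The quoted rule: blank Accept, Accept-Language, and Cookie headers,
# combined with a user agent that does not admit to being a bot.
def suspicious?(env)
  user_agent = env["HTTP_USER_AGENT"]
  blank?(env["HTTP_ACCEPT"]) &&
    blank?(env["HTTP_ACCEPT_LANGUAGE"]) &&
    blank?(env["HTTP_COOKIE"]) &&
    (blank?(user_agent) || !user_agent.downcase.include?("bot"))
end
```

A self-identified crawler with the same blank headers is left alone by this rule, which is the "positive/negative discrimination" mix described in the quote.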

We could also base rate limit or “discriminators” for rate limit buckets on info we can look up from the client IP address, either a DNS or network lookup (performance worries), or perhaps a local lookup using the free MaxMind databases that also include geocoding and some organizational info.

Does it work?

Too early to say, we just deployed it!

I sometimes get annoyed when people blog like this, but being the writer, I realized that if I wait a month to see how well it’s working to blog — I’ll never blog! I have to write while it’s fresh and still interesting to me.

But I’m encouraged that colleagues say very similar approaches have worked for them. Thanks again to Joe Corall for paving the way with a Drupal implementation, blogging it, discussing it on chat, and answering questions! And to all the other librarian and cultural heritage technologists sharing knowledge and collaboration on this and many other topics!

I can say that already it is being triggered a lot, by bots that don’t seem to get past it. This includes google bot and Meta-ExternalAgent (which I guess is AI-related; we have no particular use-based objections we are trying to enforce here, just trying to preserve our resources). While Google also has no reason to combinatorially explore every facet combination (and has a sitemap), I’m not sure if I should exempt known resource-considerate bots from the challenge (and whether to do so by trusting user-agent or not; our actual problems have all been with ordinary-browser-appearing user-agents).

by jrochkind at January 16, 2025 06:06 PM

LibraryThing (Thingology)

Two Hundred Benchmark Searches for Talpa Search

In the next few days we’re releasing an update to Talpa Search—a major jump in Talpa’s ability to find books and other media within library catalogs.

Today we’ve released a set of 200 “benchmark,” or test, searches. Together with hundreds more, these are the searches we use to assess Talpa’s quality, test particular tweaks and features, and track Talpa’s improvement over time.

This set is named “What’s that book?” It consists of 200 searches, each of which has a single best answer. For example, the best answer to the search “prince harry memoir” is Spare. The searches were created by LibraryThing staff, generally about their own books, or books they know well.

By “best answer” we don’t mean the questions are all easy, or even clear. Some examples:

The searches cover different types of searches:

Talpa handles some other broad types, such as date-restricted titles (“1980s teen films”), author searches (“Lisa Carey”) and subjects (“persian art”), but these don’t have a single best answer, so they’re not included in this set.

We’ve included typos and spelling mistakes, because a good system should be able to handle these:

We’ve tried to include different ways a patron might word their search:

To mirror patrons’ interests, half the searches are for new books, published 2023–2024. The other half are for books published before 2023. Although Talpa Search can handle movies and music, only book searches are included in this set.

That’s it!

For fun, we’ve hidden the answers so you can test yourself. If you want all the answers unhidden, click show all answers. The searches are also available as a text file, here.

If you find any problems, let us know!

The Queries

Show all answers

Question 1

Q: children’s book boy baked into a cake

A: Show answer

Score: 100 – Position: 1

Question 2

Q: retired female assassins

A: Show answer

Score: 100 – Position: 1

Question 3

Q: internet social history washington post reporter

A: Show answer

Score: 100 – Position: 1

Question 4

Q: popular serial killer novel set in missouri

A: Show answer

Score: 0 – Not in results list.

Question 5

Q: maine kids book with out of control donut machine

A: Show answer

Score: 80 – Position: 2

Question 6

Q: seventh Cormoran Strike book

A: Show answer

Score: 100 – Position: 1

Question 7

Q: children’s book about dutch resistance and windmills

A: Show answer

Score: 100 – Position: 1

Question 8

Q: tintin with yeti

A: Show answer

Score: 100 – Position: 1

Question 9

Q: new gamache mystery

A: Show answer

Score: 100 – Position: 1

Question 10

Q: World war 1 historical fantasy taking place in flanders

A: Show answer

Score: 30 – Position: 10

Question 11

Q: artemis fowl graphic novel after eternity code

A: Show answer

Score: 80 – Position: 2

Question 12

Q: orc buys coffee shop

A: Show answer

Score: 100 – Position: 1

Question 13

Q: flat stanley in france

A: Show answer

Score: 100 – Position: 1

Question 14

Q: new reagan bio

A: Show answer

Score: 70 – Position: 3

Question 15

Q: historical fiction set in irish hospital during the great flu

A: Show answer

Score: 100 – Position: 1

Question 16

Q: sequel to the sparrow

A: Show answer

Score: 100 – Position: 1

Question 17

Q: ya mystery, dark academia, black author, missing roommate

A: Show answer

Score: 30 – Position: 7

Question 18

Q: kids book where a boys best friend drowns

A: Show answer

Score: 80 – Position: 2

Question 19

Q: Historical fiction about black bookbinder in victorian era

A: Show answer

Score: 60 – Position: 4

Question 20

Q: Fiction, banned books in a southern town, little free library

A: Show answer

Score: 100 – Position: 1

Question 21

Q: History of time between Lincoln’s election and the Civil War by popular author

A: Show answer

Score: 0 – Not in results list.

Question 22

Q: recent hot memoir of tech journalist

A: Show answer

Score: 30 – Position: 11

Question 23

Q: patrick stewart bio

A: Show answer

Score: 100 – Position: 1

Question 24

Q: novel where four strangers trap two gay dads and their daughter in a summer cabin

A: Show answer

Score: 100 – Position: 1

Question 25

Q: that book about hunting a whale

A: Show answer

Score: 100 – Position: 1

Question 26

Q: Campus novel about undocumented student at harvard university

A: Show answer

Score: 0 – Not in results list.

Question 27

Q: academic book about “paradigm shifts” in science

A: Show answer

Score: 100 – Position: 1

Question 28

Q: throwing muses memoir

A: Show answer

Score: 100 – Position: 1

Question 29

Q: picture book with no words about a snowman

A: Show answer

Score: 100 – Position: 1

Question 30

Q: woman manages hotel on Nantucket that has a ghost

A: Show answer

Score: 100 – Position: 1

Question 31

Q: Recent memoir of author who was stabbed

A: Show answer

Score: 100 – Position: 1

Question 32

Q: children’s book about boy throwing things into a tree

A: Show answer

Score: 70 – Position: 3

Question 33

Q: steampunk in victorian london with Japanese watchmaker

A: Show answer

Score: 100 – Position: 1

Question 34

Q: Marvel universe novel about retired superhero turned private investigator

A: Show answer

Score: 100 – Position: 4

Question 35

Q: Fantasy fiction about books with different powers/magic

A: Show answer

Score: 0 – Not in results list.

Question 36

Q: book about gay friends and AIDs in 1980s Chicago

A: Show answer

Score: 100 – Position: 1

Question 37

Q: science fiction novel where Greek and Chinese science are both true

A: Show answer

Score: 0 – Not in results list.

Question 38

Q: vanderbeekers

A: Show answer

Score: 100 – Position: 1

Question 39

Q: Historical fantasy about servant with magical powers, set during Spanish Golden Age

A: Show answer

Score: 0 – Not in results list.

Question 40

Q: second roselynde chronicles

A: Show answer

Score: 80 – Position: 2

Question 41

Q: Fiction romance about a stand up comedian trying to figure out his breakup

A: Show answer

Score: 100 – Position: 1

Question 42

Q: french elephant lives in the city

A: Show answer

Score: 80 – Position: 2

Question 43

Q: barefoot contessa memoir

A: Show answer

Score: 100 – Position: 1

Question 44

Q: series with 3 orphan kids first book

A: Show answer

Score: 100 – Position: 1

Question 45

Q: estranged, grieving brothers in ireland

A: Show answer

Score: 30 – Position: 15

Question 46

Q: obituaries from an AI

A: Show answer

Score: 80 – Position: 2

Question 47

Q: third sylvia day crossfire book

A: Show answer

Score: 100 – Position: 1

Question 48

Q: New sci fi about robot girlfriend

A: Show answer

Score: 100 – Position: 1

Question 49

Q: men coming out of the attic

A: Show answer

Score: 80 – Position: 2

Question 50

Q: late show host cookbook

A: Show answer

Score: 0 – Not in results list.

Question 51

Q: Literary fiction about queer iranian immigrant guy

A: Show answer

Score: 50 – Position: 5

Question 52

Q: picture book about ducks in boston

A: Show answer

Score: 100 – Position: 1

Question 53

Q: book about AI from One Useful Thing guy

A: Show answer

Score: 80 – Position: 2

Question 54

Q: Fiction about teen girl boxers in Reno, nevada

A: Show answer

Score: 0 – Not in results list.

Question 55

Q: Exploration of plants’ intelligence, nonfiction

A: Show answer

Score: 100 – Position: 1

Question 56

Q: Mystery novel about girl filming documentary about her missing mother

A: Show answer

Score: 30 – Position: 14

Question 57

Q: memoir of prince harry

A: Show answer

Score: 100 – Position: 1

Question 58

Q: Non-fiction book about widow discovering husband’s secret life

A: Show answer

Score: 60 – Position: 4

Question 59

Q: man turns into a cochroach

A: Show answer

Score: 100 – Position: 1

Question 60

Q: tintin book after Explorers on the Moon

A: Show answer

Score: 100 – Position: 1

Question 61

Q: amis book where soviets control britain

A: Show answer

Score: 100 – Position: 1

Question 62

Q: maine kids book where hero has a pet skunk

A: Show answer

Score: 0 – Not in results list.

Question 63

Q: Teenager disappears from Adirondack summer camp

A: Show answer

Score: 100 – Position: 1

Question 64

Q: most recent starlight’s shadow book

A: Show answer

Score: 50 – Position: 5

Question 65

Q: Fantasy about fox spirit girl in manchuria

A: Show answer

Score: 100 – Position: 1

Question 66

Q: Book where a girl has to walk or else she will die

A: Show answer

Score: 0 – Not in results list.

Question 67

Q: all art is alive comic

A: Show answer

Score: 0 – Not in results list.

Question 68

Q: popular history of period after bronze age collapse

A: Show answer

Score: 80 – Position: 2

Question 69

Q: nonfiction crack cocaine era

A: Show answer

Score: 100 – Position: 1

Question 70

Q: memoir by paul fussell’s son

A: Show answer

Score: 0 – Not in results list.

Question 71

Q: book that is an allegory of genesis story in california

A: Show answer

Score: 100 – Position: 1

Question 72

Q: woman in office accidentally gains access to coworkers’ emails

A: Show answer

Score: 100 – Position: 1

Question 73

Q: thriller with cat and el morgan

A: Show answer

Score: 0 – Not in results list.

Question 74

Q: Croatian children’s book about a brave shoemaker’s apprentice

A: Show answer

Score: 100 – Position: 1

Question 75

Q: murderbot book after network effect

A: Show answer

Score: 100 – Position: 1

Question 76

Q: Navalny book

A: Show answer

Score: 100 – Position: 3

Question 77

Q: that dystopian book with jonas and gabriel

A: Show answer

Score: 80 – Position: 2

Question 78

Q: sequel to jurrasic park book

A: Show answer

Score: 100 – Position: 1

Question 79

Q: ancient greek dream manual

A: Show answer

Score: 100 – Position: 1

Question 80

Q: french girl makes deal with devil becomes immortal

A: Show answer

Score: 100 – Position: 1

Question 81

Q: sequel to To Tame a Sheikh

A: Show answer

Score: 100 – Position: 1

Question 82

Q: Popular 2024 romance book about children’s librarian

A: Show answer

Score: 100 – Position: 1

Question 83

Q: dolly parton and sister cookbook

A: Show answer

Score: 100 – Position: 1

Question 84

Q: book that became american fiction movie

A: Show answer

Score: 0 – Not in results list.

Question 85

Q: picture book about axolotls

A: Show answer

Score: 100 – Position: 4

Question 86

Q: childrens books about chihuahua with giant ears second book

A: Show answer

Score: 80 – Position: 2

Question 87

Q: most recent kingsbridge book

A: Show answer

Score: 100 – Position: 1

Question 88

Q: historical fiction boston journalist wwii

A: Show answer

Score: 0 – Not in results list.

Question 89

Q: heinlein novel where kids are trapped on an alien planet

A: Show answer

Score: 100 – Position: 1

Question 90

Q: Memoir of a woman whos husband died while running a half marathon

A: Show answer

Score: 30 – Position: 9

Question 91

Q: sci-fi novel with telepathically-linked dogs in a medieval world

A: Show answer

Score: 80 – Position: 2

Question 92

Q: romcom with bartender and librarian

A: Show answer

Score: 0 – Not in results list.

Question 93

Q: persian epic about creation and the gods

A: Show answer

Score: 100 – Position: 1

Question 94

Q: New book on the history of hip hop

A: Show answer

Score: 100 – Position: 1

Question 95

Q: poems about taylor swift songs

A: Show answer

Score: 60 – Position: 4

Question 96

Q: prince harry memoir

A: Show answer

Score: 100 – Position: 1

Question 97

Q: jesus wife papyrus hoax

A: Show answer

Score: 100 – Position: 1

Question 98

Q: where’s my binkit

A: Show answer

Score: 100 – Position: 1

Question 99

Q: children’s book with grandmother and bowl of mush

A: Show answer

Score: 100 – Position: 1

Question 100

Q: bono memoir

A: Show answer

Score: 100 – Position: 1

Question 101

Q: ya comedic novel about beauty queens stranded on desert island

A: Show answer

Score: 100 – Position: 1

Question 102

Q: Nonfiction book about the science of successful communicators

A: Show answer

Score: 100 – Position: 1

Question 103

Q: recent popular historical fiction about women in vietnam war

A: Show answer

Score: 100 – Position: 1

Question 104

Q: sequel to 1177

A: Show answer

Score: 100 – Position: 1

Question 105

Q: children’s book about a girl who spies on her friends and takes notes

A: Show answer

Score: 100 – Position: 1

Question 106

Q: nurse in vietnam novel

A: Show answer

Score: 100 – Position: 1

Question 107

Q: College friends reunite for week in Maine

A: Show answer

Score: 100 – Position: 1

Question 108

Q: fourth magic treehouse

A: Show answer

Score: 100 – Position: 1

Question 109

Q: woman and iranian family argue about a house novel

A: Show answer

Score: 100 – Position: 1

Question 110

Q: Newest Tana French novel

A: Show answer

Score: 100 – Position: 1

Question 111

Q: stan lee graphic novel bio

A: Show answer

Score: 100 – Position: 1

Question 112

Q: roald dahl book with kid in gypsy caravan

A: Show answer

Score: 100 – Position: 1

Question 113

Q: book where all wheat and rice dies. it takes place in britain

A: Show answer

Score: 100 – Position: 1

Question 114

Q: ex-amazon employee memoir

A: Show answer

Score: 100 – Position: 2

Question 115

Q: The 22nd book in the Mitch Rapp series

A: Show answer

Score: 100 – Position: 1

Question 116

Q: recent ireland dystopia novel

A: Show answer

Score: 100 – Position: 1

Question 117

Q: rupaul new autobiography

A: Show answer

Score: 100 – Position: 1

Question 118

Q: 2024 fantasy novel slavic folklore baba yaga

A: Show answer

Score: 30 – Position: 17

Question 119

Q: british murder mystery where a fictional version of the author is a character

A: Show answer

Score: 50 – Position: 5

Question 120

Q: 2024 nonfiction about memory by memory researcher

A: Show answer

Score: 30 – Position: 9

Question 121

Q: recent book by calvin and hobbes creator

A: Show answer

Score: 80 – Position: 2

Question 122

Q: Funny book about single mom who creates popular OnlyFans account

A: Show answer

Score: 30 – Position: 14

Question 123

Q: contemporary novel about divorce in new york city, multiple narrators

A: Show answer

Score: 70 – Position: 3

Question 124

Q: challenger book by chernobyl author

A: Show answer

Score: 70 – Position: 3

Question 125

Q: romance novel about tv dating show with a plus sized lead

A: Show answer

Score: 100 – Position: 1

Question 126

Q: Memoir of time in psych ward, reading books by fellow mad women

A: Show answer

Score: 60 – Position: 4

Question 127

Q: collection with tower of babylon story

A: Show answer

Score: 100 – Position: 1

Question 128

Q: writing advice from steven king

A: Show answer

Score: 100 – Position: 1

Question 129

Q: crichton book with the nanotech

A: Show answer

Score: 100 – Position: 1

Question 130

Q: peter brown autobiography

A: Show answer

Score: 100 – Position: 1

Question 131

Q: Mystery set on a reality tv love island dating show

A: Show answer

Score: 50 – Position: 5

Question 132

Q: Retelling of Huck Finn from Jim’s point of view

A: Show answer

Score: 100 – Position: 1

Question 133

Q: Sequel to YA fantasy about home for magical misfit children

A: Show answer

Score: 0 – Not in results list.

Question 134

Q: novel Rushdie wrote after the knife attack

A: Show answer

Score: 80 – Position: 2

Question 135

Q: Romance about girl who loves pro golfer

A: Show answer

Score: 40 – Position: 6

Question 136

Q: second book of Meier’s Marginal Jew

A: Show answer

Score: 100 – Position: 1

Question 137

Q: living boy meets grandfather in ghost world

A: Show answer

Score: 70 – Position: 3

Question 138

Q: what is the third bond book?

A: Show answer

Score: 100 – Position: 1

Question 139

Q: gisele cookbook

A: Show answer

Score: 100 – Position: 2

Question 140

Q: romance with Derek Pender

A: Show answer

Score: 100 – Position: 1

Question 141

Q: Alphabetical essays on climate change

A: Show answer

Score: 30 – Position: 7

Question 142

Q: Robot girlfriend discovers autonomy

A: Show answer

Score: 100 – Position: 1

Question 143

Q: recent book about captain cook’s last voyage

A: Show answer

Score: 100 – Position: 1

Question 144

Q: Historical fiction about the panama canal construction and people involved

A: Show answer

Score: 30 – Position: 7

Question 145

Q: romance prince of england and son of the president

A: Show answer

Score: 100 – Position: 1

Question 146

Q: Female friends at D.C. boarding house duringheight of McCarthyism

A: Show answer

Score: 0 – Not in results list.

Question 147

Q: Magical realism, fiction, two boys who disappear for 6 months and can’t recall what happened

A: Show answer

Score: 0 – Not in results list.

Question 148

Q: literary novel about a life-long friendship of two video game developers

A: Show answer

Score: 100 – Position: 1

Question 149

Q: popular book on bronze-age collapse

A: Show answer

Score: 100 – Position: 1

Question 150

Q: buckbeak

A: Show answer

Score: 100 – Position: 1

Question 151

Q: recent book about REM

A: Show answer

Score: 80 – Position: 2

Question 152

Q: recent book on nuclear war

A: Show answer

Score: 100 – Position: 1

Question 153

Q: recent cartoon demon perspective book

A: Show answer

Score: 0 – Not in results list.

Question 154

Q: romance where justin has a reddit curse

A: Show answer

Score: 0 – Not in results list.

Question 155

Q: what is the sequel to Sunset of the Sabertooth

A: Show answer

Score: 100 – Position: 1

Question 156

Q: graphic novel youth group fights demons

A: Show answer

Score: 80 – Position: 2

Question 157

Q: New book from author of fleishman is in trouble

A: Show answer

Score: 100 – Position: 1

Question 158

Q: ya novel about young woman on pirate ship, mutiny

A: Show answer

Score: 30 – Position: 7

Question 159

Q: graphic novel boxer rebellion

A: Show answer

Score: 100 – Position: 1

Question 160

Q: Next book in sister holiday series

A: Show answer

Score: 0 – Not in results list.

Question 161

Q: last book in murakami rat series

A: Show answer

Score: 100 – Position: 1

Question 162

Q: Book about invisible woman whos brother is a suspect in a murder

Score: 30 – Position: 19

Question 163

Q: fantasy with linus baker

Score: 100 – Position: 1

Question 164

Q: girl finds necklace and meets pink bunny robot

Score: 0 – Not in results list.

Question 165

Q: New Arthurian epic

Score: 100 – Position: 1

Question 166

Q: that octopus friendship novel

Score: 80 – Position: 2

Question 167

Q: orphan girl always looks on the bright side of things

Score: 50 – Position: 5

Question 168

Q: khan academy book about AI

Score: 100 – Position: 1

Question 169

Q: Murder mystery about three foster sisters and a body found in their foster home

Score: 100 – Position: 1

Question 170

Q: beginning chapter book mystery series about a girl with a photographic memory

Score: 30 – Position: 7

Question 171

Q: official biography of steve jobs

Score: 100 – Position: 1

Question 172

Q: book about a serial killer in chicago during the chicago world’s fair

Score: 100 – Position: 1

Question 173

Q: book where girl eats manna from heaven

Score: 0 – Not in results list.

Question 174

Q: 2024 national book award novel

Score: 100 – Position: 1

Question 175

Q: zombie book with “hungries”

Score: 100 – Position: 1

Question 176

Q: Romance where she gets the expiration date of the relationships she starts

Score: 100 – Position: 1

Question 177

Q: martha ballard mystery

Score: 80 – Position: 2

Question 178

Q: last book in dune caladan trilogy

Score: 100 – Position: 1

Question 179

Q: aliens arrive in medieval Germany

Score: 100 – Position: 1

Question 180

Q: lesbian taxidermist in florida

Score: 100 – Position: 1

Question 181

Q: book about witches in the suffrage movement

Score: 100 – Position: 1

Question 182

Q: hellboy rpg

Score: 100 – Position: 1

Question 183

Q: Pre-civil war philadelphia maid and abolotionist girl help enslaved girl escape

Score: 0 – Not in results list.

Question 184

Q: graphic novel memoir set in a funeral home

Score: 100 – Position: 1

Question 185

Q: second tintin book

Score: 100 – Position: 1

Question 186

Q: dear sugar book

Score: 100 – Position: 1

Question 187

Q: time travel bureaucracy in england

Score: 100 – Position: 1

Question 188

Q: roadtrip to visit sites of political assassinations

Score: 100 – Position: 1

Question 189

Q: second Molly american girl book

Score: 80 – Position: 2

Question 190

Q: book about woman who lives in sand pit

Score: 100 – Position: 1

Question 191

Q: taylor swift book be rolling stone reporter

Score: 0 – Not in results list.

Question 192

Q: sam bankman fried book

Score: 100 – Position: 1

Question 193

Q: irish novel with tightrope walk between Twin Towers

Score: 100 – Position: 1

Question 194

Q: children’s book with easter bunny mother

Score: 100 – Position: 1

Question 195

Q: the fifth frontiers saga book

Score: 100 – Position: 1

Question 196

Q: dystopian novel with big brother

Score: 80 – Position: 2

Question 197

Q: historical murder mystery set in san francisco with a gay ex-cop

Score: 0 – Not in results list.

Question 198

Q: cerulean sea sequel

Score: 100 – Position: 1

Question 199

Q: Magical realism-western about Mexican man in Texas trying to save his family

Score: 0 – Not in results list.

Question 200

Q: literary fiction about the making of the Oxford English Dictionary

Score: 60 – Position: 4

by Tim at January 16, 2025 05:56 PM

David Rosenthal

A Prophet Of The Web

While doing the research for a future talk, I came across an obscure but impressively prophetic report entitled Accessibility and Integrity of Networked Information Collections that Cliff Lynch wrote for the federal Office of Technology Assessment in 1993, 32 years ago. I say "obscure" because it doesn't appear in Lynch's pre-1997 bibliography.

To give you some idea of the context in which it was written, unless you are over 70, it was more than half your life ago when in November 1989 Tim Berners-Lee's browser first accessed a page from his Web server. It was only about the same time that the first commercial, as opposed to research, Internet Service Providers started, with the ARPANET being decommissioned the next year. Two years later, in December of 1991, the Stanford Linear Accelerator Center put up the first US Web page. In 1992 Tim Berners-Lee codified and extended the HTTP protocol he had earlier implemented. It would be another two years before Netscape became the first browser to support HTTPS. It would be two years after that before the IETF approved HTTP/1.0 in RFC 1945. As you can see, Lynch was writing among the birth-pangs of the Web.

Although Lynch was insufficiently pessimistic, he got a lot of things exactly right. Below the fold I provide four out of many examples.

Page numbers refer to the PDF, not to the original. Block quotes without a link are from the report.

Disinformation

Page 66
When discussing the "strong bias in the Internet user community to prefer free information sources" he was, alas, prescient although it took more than "a few years":
The ultimate result a few years hence — and it may not be a bad or inappropriate response, given the reality of the situation — may be a perception of the Internet and much of the information accessible through it as the "net of a million lies", following science fiction author Vernor Vinge's vision of an interstellar information network characterized by the continual release of information (which may or may not be true, and where the reader often has no means of telling whether the information is accurate) by a variety of organizations for obscure and sometimes evil reasons.
The Vernor Vinge reference is to A Fire Upon the Deep:
In the novel, the Net is depicted as working much like the Usenet network in the early 1990s, with transcripts of messages containing header and footer information as one would find in such forums.
The downsides of a social medium to which anyone can post without moderation were familiar to anyone who was online in the days of the Usenet:
Usenet is culturally and historically significant in the networked world, having given rise to, or popularized, many widely recognized concepts and terms such as "FAQ", "flame", sockpuppet, and "spam".
...
Likewise, many conflicts which later spread to the rest of the Internet, such as the ongoing difficulties over spamming, began on Usenet.
"Usenet is like a herd of performing elephants with diarrhea. Massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it."

— Gene Spafford, 1992
Earlier in the report Lynch had written (Page 23):
Access to electronic information is of questionable value if the integrity of that information is seriously compromised; indeed, access to inaccurate information, or even deliberate misinformation, may be worse than no access at all, particularly for the naive user who is not inclined to question the information that the new electronic infrastructure is offering.
This resonates as the wildfires rage in Los Angeles.

Information Doesn't Want To Be Free

Although Tim Berners-Lee's initial HTTP specification included the status code 402 Payment Required:
The parameter to this message gives a specification of charging schemes acceptable. The client may retry the request with a suitable ChargeTo header.
the Web in 1993 lacked paywalls. But Lynch could see them coming (Page 22):
There is a tendency to incorrectly equate access to the network with access to information; part of this is a legacy from the early focus on communications infrastructure rather than network content. Another part is the fact that traditionally the vast bulk of information on the Internet has been publicly accessible if one could simply obtain access to the Internet itself, figure out how to use it, and figure out where to locate the information you wanted. As proprietary information becomes accessible on the Internet on a large scale, this will change drastically. In my view, access to the network will become commonplace over the next decade or so, much as access to the public switched telephone network is relatively ubiquitous today. But in the new "information age" information will not necessarily be readily accessible or affordable;
The current RFC 9110 states:
The 402 (Payment Required) status code is reserved for future use.
Instead today's Web is infested with paywalls, each with their own idiosyncratic user interface, infrastructure, and risks.
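As a small illustration of how the reserved status code survives in today's tooling, here is a minimal sketch using Python's standard `http.HTTPStatus` table. The `respond` function and its `has_paid` flag are hypothetical, invented only for this example; they are not taken from the report, the RFC, or any real paywall.

```python
from http import HTTPStatus


def respond(has_paid: bool) -> tuple[int, str]:
    """Hypothetical paywall check: 200 for paying readers, 402 otherwise."""
    status = HTTPStatus.OK if has_paid else HTTPStatus.PAYMENT_REQUIRED
    # HTTPStatus carries both the numeric code and the standard reason phrase.
    return status.value, status.phrase


print(respond(has_paid=False))
print(respond(has_paid=True))
```

The code and phrase are still defined, but nothing in the standard says what a client should do on receiving them, which is why each paywall ends up inventing its own flow.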

The Death Of "First Sale"

Lynch understood the highly consequential nature of the change in the business model of paid information access from purchasing a copy to renting access to the publisher's copy; from a legal framework of copyright and the "first sale" doctrine, to one of copyright and contract law (Page 30):
Now, consider a library acquiring information in an electronic format. Such information is almost never, today, sold to a library (under the doctrine of first sale); rather, it is licensed to the library that acquires it, with the terms under which the acquiring library can utilize the information defined by a contract typically far more restrictive than copyright law. The licensing contract typically includes statements that define the user community permitted to utilize the electronic information as well as terms that define the specific uses that this user community may make of the licensed electronic information. These terms typically do not reflect any consideration of public policy decisions such as fair use, and in fact the licensing organization may well be liable for what its patrons do with the licensed information.
The power imbalance between publishers and their customers is of long standing, and it especially affects the academic literature. In 1989 the Association of Research Libraries published Report of the ARL Serials Prices Project:
The ARL Serials Initiative forms part of a special campaign mounted by librarians in the 1980s against the high cost of serials subscriptions. This is not the first time that libraries have suffered from high serial prices. For example, in 1927 the Association of American Universities reported that:
"Librarians are suffering because of the increasing volume of publications and rapidly rising prices. Of special concern is the much larger number of periodicals that are available and that members of the faculty consider essential to the successful conduct of their work. Many instances were found in which science departments were obligated to use all of their allotment for library purposes to purchase their periodical literature which was regarded as necessary for the work of the department"
The oligopoly rents extracted by academic publishers have been a problem for close on a century, if not longer! Lynch's analysis of the effects of the Web's amplification of this power imbalance is wide-ranging, including (Page 31):
Very few contracts with publishers today are perpetual licenses; rather, they are licenses for a fixed period of time, with terms subject to renegotiation when that time period expires. Libraries typically have no controls on price increase when the license is renewed; thus, rather than considering a traditional collection development decision about whether to renew a given subscription in light of recent price increases, they face the decision as to whether to lose all existing material that is part of the subscription as well as future material if they choose not to commit funds to cover the publisher's price increase at renewal time.
Thus destroying libraries' traditional role as stewards of information for future readers. And (Page 30):
Of equal importance, the contracts typically do not recognize activities such as interlibrary loan, and prohibit the library licensing the information from making it available outside of that library's immediate user community. This destroys the current cost-sharing structure that has been put in place among libraries through the existing interlibrary loan system, and makes each library (or, perhaps, the patrons of that library) responsible for the acquisitions cost of any material that is to be supplied to those patrons in electronic form. The implications of this shift from copyright law and the doctrine of first sale to contract law (and very restrictive contract terms) is potentially devastating to the library community and to the ability of library patrons to obtain access to electronic information — in particular, it dissolves the historical linkage by which public libraries can provide access to information that is primarily held by research libraries to individuals desiring access to this information. There is also a great irony in the move to licensing in the context of computer communications networks — while these networks promise to largely eliminate the accidents of geography as an organizing principle for inter-institutional cooperation and to usher in a new era of cooperation among geographically dispersed organizations, the shift to licensing essentially means that each library contracting with a publisher or other information provider becomes as isolated, insular organization that cannot share its resources with any other organization on the network.

Surveillance Capitalism

Lynch also foresaw the start of "surveillance capitalism" (Page 60):
we are now seeing considerable use of multi-source data fusion: the matching and aggregation of credit, consumer, employment, medical and other data about individuals. I expect that we will recapitulate the development of these secondary markets in customer behavior histories for information seeking in the 1990s; we will also see information-seeking consumer histories integrated with a wide range of other sources of data on individual behavior.

The ability to accurately, cheaply and easily count the amount of use that an electronic information resource receives (file accesses, database queries, viewings of a document, etc.) coupled with the ability to frequently alter prices in a computer-based marketplace (particularly in acquire on demand systems that operate on small units of information such as journal articles or database records, but even, to a lesser extent, by renegotiating license agreements annually) may give rise to a number of radical changes. These potentials are threatening for all involved.
He described search-based advertising (Page 61):
The ability to collect not only information on what is being sought out or used but also who is doing the seeking or using is potentially very valuable information that could readily be resold, since it can be used both for market analysis (who is buying what) and also for directed marketing (people who fit a certain interest profile, as defined by their information access decisions, would likely also be interested in new product X or special offer Y). While such usage (without the informed consent of the recipient of the advertising) may well offend strong advocates of privacy, in many cases the consumers are actually quite grateful to hear of new products that closely match their interests. And libraries and similar institutions, strapped for revenue, may have to recognize that usage data can be a valuable potential revenue source, no matter how unattractive they find collecting, repackaging and reselling this information.
Of course, it wasn't the libraries but Google, spawned from the Stanford Digital Library Project, which ended up collecting the information and monetizing it. And the power imbalance between publishers and readers meant that the reality of tracking was hidden (Page 63):
when one is accessing (anonymously or otherwise) a public-access information service, it is unclear what to expect, and in fact at present there is no way to even learn what the policy of the information service provider is.

by David. (noreply@blogger.com) at January 16, 2025 04:00 PM

Peter Murray

Issue 103: Time Standards

This week, I'm going to tug on time. This follows the last item in last week's issue of Thursday Threads: The Clock that Made Power Grids Possible. Two years ago, I also published an issue about time, pointing to articles about eliminating the leap second, time standards on the moon, and observational humor on how we might explain our concept of time to aliens. That last one might form the thread that I tug on in the next issue because it treads on whether our digital selves will stand the test of time.

This week:

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

Paris' City-wide Synchronized Clock

[The Paris Pneumatic Clock] system was created in 1880 by Austrian engineer Victor Popp – and just 5 years later, thousands of these clocks were placed all over the city – in hotels, train stations, houses, schools and public streets. We modeled this incredible system and the special machine at the heart of it, to show you how a series of underground pipes and mechanical clocks kept an entire city in sync.
The Incredible Paris Pneumatic Clock System from 1880, Primal Nebula, 24-Feb-2024

The 8-minute video companion to the above article is great to watch, too. This is a marvel of engineering — synchronizing the clocks of a whole city through puffs of air traveling through pipes. This system—accurate to a minute—was just 35 years before the sub-second precision required to synchronize the power grid, as described at the end of last week's issue.

Time is very different in Kathmandu

Most of the world is on a whole number of hours before or after UTC. About a fifth of the world by population is on a half-hour offset from UTC; in particular, India is 5h30m ahead of UTC. Nepal is 5h45m ahead of UTC
Australia/Lord_Howe is the weirdest timezone, SSO Ready blog, undated

I first encountered this when setting up a Zoom meeting for colleagues in Kathmandu. While most countries neatly set their clocks to full hour offsets (or, as noted in the quote above, a half-hour offset), Nepal ticks to its own clock with a 5-hour and 45-minute offset from UTC. It's as if Nepal took a look at the standard time zones and said, "Why be ordinary when you can add a twist?" Imagine trying to schedule a call back home, perplexed as you reconcile not just the time difference but—and here's the kicker—those extra 15 minutes that make Nepal unique.
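To make the arithmetic concrete, here is a minimal sketch of that scheduling problem using only Python's standard library. It assumes the fixed offsets quoted above (India at +5:30, Nepal at +5:45) and ignores daylight saving, which neither country observes; the `local_times` helper and the meeting time are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets quoted in the text; the labels are just display names.
INDIA = timezone(timedelta(hours=5, minutes=30), "IST")
NEPAL = timezone(timedelta(hours=5, minutes=45), "NPT")


def local_times(meeting_utc: datetime) -> dict[str, str]:
    """Return the wall-clock time of a UTC meeting in each zone (hypothetical helper)."""
    return {
        "UTC": meeting_utc.strftime("%H:%M"),
        "India": meeting_utc.astimezone(INDIA).strftime("%H:%M"),
        "Nepal": meeting_utc.astimezone(NEPAL).strftime("%H:%M"),
    }


# An example meeting at 14:00 UTC.
meeting = datetime(2025, 1, 16, 14, 0, tzinfo=timezone.utc)
print(local_times(meeting))
```

For real scheduling, a time zone database such as the one behind Python's `zoneinfo` module is the safer choice, since offsets can change over time.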

Moon GPS is Coming

NASA and its partners in Europe and Japan are developing lunar satnav concepts that could be deployed by the end of the 2020s. In July, China’s National Space Administration (CNSA) unveiled its plans for a constellation of 21 communications and navigation satellites to support its lunar aspirations.
Moon GPS Is Coming: Nations and companies are ramping up their efforts to deploy the first satnav on the moon to support a flurry of planned missions there, Wired, 4-Sep-2024

The Thursday Threads issue two years ago talked about the need to keep accurate time on the moon. Following an announcement from the White House early in 2024 directing NASA to create a time standard for the moon, U.S., European, and Chinese efforts are underway to make that happen.

What Do A.M. and P.M. Stand For?

If you know how to tell time, you probably understand and use a.m. and p.m., and you might even know the terms come from Latin phrases. But do you know what exactly those phrases are, or what they mean in English?
What Do A.M. and P.M. Stand For?, Mental Floss, 4-Apr-2024

File this away for use at parties...

This Week's Troublemaker: Pickle

Tuxedo cat with white paws and chest lounging on a chair with a red harness and leash, exuding a relaxed and attentive demeanor.

So let's talk about the third cat in the house (after Alan in the last issue and Mittens in the issue before). This is Pickle, a black-and-white Tuxedo cat with a drive for food that I've never witnessed in another cat. Two stories from one recent afternoon: First, when my wife got home from the grocery store, Pickle grabbed the bag of doughnuts from a canvas bag and made off with a big chunk of a long-john. Then, when she was fixing dinner, Pickle jumped on the counter and made off with a hunk of steak. My wife chased her around the dining room table, through the living room, and up the stairs to my daughter’s room. I rushed to follow, and we trapped Pickle between the headboard and the wall. My wife thinks the cat wolfed down a sizable chunk of meat before we could catch her.

That, ladies and gentlemen, is Pickle.

by Peter Murray at January 16, 2025 05:00 AM

January 15, 2025

Hugh Rundle

You should get a blog

I run some lightweight privacy-respecting self-hosted analytics for my blog, so I know what my most popular posts were in 2024. It's hardly surprising that many of these were also published last year, but they include one from 2013 and another from 2018. Having a quick peek at my stats reminded me that the blog content that is most appreciated, shared and read is often not what you might think and some posts retain value over time. My most popular posts last year include a couple of write-ups of conferences I attended, a conference talk I gave, a highly personal reflection on the biggest single-day wave of people moving from Twitter to Mastodon (which is far and away the most-read blog post I've ever written), as well as a few technical descriptions of how to do specific things, and a post I wrote 11 years ago about 3D printers.

What I personally appreciate a great deal is blog posts outlining exactly how some technical thing works, or a step by step description of how someone did something. This also happens to be some of the most consistently popular content on my own blog - the top post last year is something of no relevance to my day job and about a topic I am not really an expert in. But it explains step by step how I did something that a lot of people want to know how to do, so it's useful to the world.

I want to read more stuff like this - helpful tips from human beings who aren't trying to sell anything and aren't just posting to get a cheap reaction on social media. You should get a blog.

I don't know what to write about

The great thing about having your own blog is that there are no rules: you can write about whatever you want. I started off mostly throwing my uninformed opinions about librarianship into the void, but over the years I've written a lot of different things, just as the list above shows.

Liam's blog was originally a food blog but is now quite eclectic: brief commentary on the New South Wales planning scheme, observations about how emergency management differs between countries, notes about what he's reading, and the occasional nori roll recipe.

Ed mostly posts about technology but sometimes shares a whimsical photograph.

Julia specialises in incredibly detailed explanations of how various computer things work but sometimes she'll post about crochet patterns or how to write zines.

Jessamyn writes about whatever is on her mind which could mostly be described as "libraries and open culture" but covers a lot of ground.

Nobody would be interested in anything I have to say

Are things interesting to you? Did you learn something today? Congratulations, you have something to write about that will be interesting to someone else. I've written blog posts that I thought were well crafted and interesting, and have hardly any views. I've bashed out some half-arsed thoughts off the top of my head, and they've ended up being the most popular things I've ever published. Who knows, man? Just try not to defame anyone, and then put it out there.

I don't have time to blog regularly

Neither do I, that's why I don't publish posts regularly. The same rule applies here as for what to write about - there are no rules! Liam publishes nothing for months, and then pumps out four posts in a week. I've had wildly different posting schedules over the years. Adam Mastroianni would rather trash his draft and publish nothing than post something he's not happy with just to keep a schedule. Ashley published one post a week for 39 weeks and then took five months off.

I'm worried big tech will steal my work for their AI

Ten years ago I wrote about how I got over my fears about my blog posts being used for corporate profit. I think the same applies to LLMs, even though they don't attribute their sources. If writing is how you make your livelihood then different rules apply.

I already post on Facebook/LinkedIn/Mastodon

Great, I can't read it on Facebook or LinkedIn, because they're enclosed spaces that require a login to read them. This is also why I strongly urge against using something like Medium, which isn't really a blogging platform since it requires logging in to read posts. Mastodon and other fediverse software and platforms are better, but they're not blogs. You think differently when you're posting longer form content, using a platform that's designed for that.

Ok I'm convinced! How do I get a blog?

I recommend something that provides

All the suggestions below offer these.

This is not actually necessary, but I strongly suggest you set up your own domain name (e.g. example.com) and set it to auto-renew so you don't accidentally lose it. Some webhosts provide domain registration as well, or you can do it separately. Somewhere like Gandi will get you started. Don't use GoDaddy.

Once you've done that, it all depends on how you prefer to do things, and what your budget is. Earlier this week I asked my Mastodon bubble for suggestions for first-time bloggers - thanks to everyone for your suggestions!

If you can't be bothered reading everything below or it's too hard to decide, get a WordPress blog with Reclaim Hosting.

I just want to throw a normal file into Google Drive or Dropbox and have it magically turn into a blog post

USD$5 per month

I haven't used this myself, but Blot might be exactly what you're looking for. The demo on the website looks pretty impressive to me, and the price is attractive. You can use your own custom domain with Blot and it is fully managed for you. Once you've configured Blot, you publish by adding files and folders to a synced folder in Google Drive, Dropbox, or a Git repository, so you can use an application you already know to actually write your content, like MS Word or your favourite text editor.

I like the idea of hacking my own HTML file but don't care about having my own domain name

FREE

If you're too young to remember Geocities, or old enough and are still mourning its demise, then Neocities might be for you. Neocities is designed to be the 2020s version of Geocities: you write raw HTML and the example sites look kinda out there and glitchy because that's the point.

I want to use something with a WYSIWYG interface and a lot of support options

~AUD$5-$20 per month

A lot of people recommended hosted WordPress as the best option for most people. See my note further down about WordPress.com and why I do not recommend WordPress.com as a host. Whilst at the time of writing you may hear that "the WordPress world is in turmoil right now", the reality is that this is extremely unlikely to impact most owners of hosted WordPress sites. The argument is within the WordPress developer community, and however it is resolved, it's in everyone's interest for WordPress users to barely notice. WordPress is also a piece of openly-licensed software rather than a platform that can just be switched off.

Reclaim Hosting comes highly recommended by many people over time. They're focussed on higher education in the USA but anyone can sign up for a personal plan at very attractive pricing. This is probably the best option for most people.

If you want something based in Australia with local support, a couple of different people recommended VentraIP. This will be more expensive than Reclaim even after accounting for currency exchange rates.

There are many other options - look for "Hosted WordPress". Generally what you get is "shared hosting" with "CPanel", which means your blog will be in a separated section of a web server also hosting several other websites, and you can use a web interface to configure things like the domain you use for your blog. Your chosen host will usually have good documentation on how to get set up.

I want to write in markdown and then press publish

USD$9 per month

Ghost was originally a Kickstarter project by a former WordPress core developer, but has developed quickly from there. Ghost can be used for both websites (e.g. 404 Media) and newsletters (e.g. Mita Williams' University of Winds). Ghost takes the clean and simple markdown-based approach of static site generators but removes all the nerdy futzing so it's more like the WordPress experience. Indeed whilst writing in markdown was originally the only way to use Ghost, it now offers a rich WYSIWYG writing interface as well, so you can compare Ghost and hosted WordPress to see which one you prefer. I published this blog using self-hosted Ghost for a while.

...but I don't want to pay for it

FREE

Publii is an open source static site generator (see below for more on this), but you can connect it to a free GitHub or GitLab Pages account to publish. Interestingly, publishing and configuring Publii works as a desktop application rather than a web interface, which is a little different to most of the options listed here and makes it a lot simpler for normal people than a commandline based system like I describe below.

Blogger is a free service from Google. It's quite bare-bones and really geared towards posting content to attract people to view ads where you share the revenue with Google, so the primary use of Blogger is by spam-blogs. As a Google product you also never know when it will join the Google graveyard. But if you're looking for something basic and free, Blogger was nominated by a couple of people in my unscientific survey, and you will be joining successful and interesting bloggers like Aaron Tay.

I need a frustrating hobby

FREE to ~USD$10

If you're keen to have more control, you can look into using a static site generator (SSG). An SSG is essentially a commandline script that takes a bunch of input files and outputs a website - HTML files in directories, with all the relevant images, CSS and JavaScript and everything pointing to the right place. Different SSGs use different templating languages, but pretty much all of them use markdown in the page content file and convert it into HTML using an appropriate template.
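To show the basic idea, here is a toy sketch of an SSG in Python. It is not how Zola, Eleventy or any real SSG works internally: the `toy_markdown` and `build_site` functions, the `content` and `public` directory names, and the one-line template are all invented for this example, and the "markdown" support only covers `# ` headings and plain paragraphs.

```python
import html
import tempfile
from pathlib import Path

# A deliberately tiny page template; real SSGs use full templating languages.
TEMPLATE = "<html><head><title>{title}</title></head><body>\n{body}\n</body></html>\n"


def toy_markdown(text: str) -> str:
    """Convert a tiny markdown subset: '# ' headings and blank-line-separated paragraphs."""
    blocks = []
    for block in text.strip().split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("# "):
            blocks.append(f"<h1>{html.escape(block[2:].strip())}</h1>")
        else:
            # Collapse line breaks inside a paragraph into single spaces.
            blocks.append(f"<p>{html.escape(' '.join(block.split()))}</p>")
    return "\n".join(blocks)


def build_site(src: Path, out: Path) -> list[Path]:
    """Turn every .md file in src into out/<name>/index.html and return the paths written."""
    written = []
    for page in sorted(src.glob("*.md")):
        target = out / page.stem / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(TEMPLATE.format(title=page.stem, body=toy_markdown(page.read_text())))
        written.append(target)
    return written


# Demo: build a one-page site in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp, "content"), Path(tmp, "public")
    src.mkdir()
    (src / "hello.md").write_text("# Hello\n\nMy first post.\n")
    pages = build_site(src, out)
    demo_html = pages[0].read_text()
    print(demo_html)
```

Everything a real SSG adds (themes, RSS feeds, image handling, tags) is layered on top of this same input-files-to-HTML-files loop.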

My blog is made using Zola, but I've previously used Eleventy. To publish with an SSG you either need to use GitLab Pages (which works with most SSGs) or GitHub Pages (which only builds Jekyll sites for you out of the box, although it can serve the output of other SSGs); or have control over some space on some kind of webserver - either shared hosting (something with CPanel), or a standalone virtual private server (VPS). There's a bit of technical work involved to publish this way, so it's not surprising that a great many blogs published with SSGs start off with a couple of posts about how they set up their blog, and sometimes end there. If you want to procrastinate with your SSG setup instead of writing blog posts, this could be a great choice.

Why didn't you recommend the services I've heard of?

The WordPress world is currently experiencing some difficulties, after one of the original creators of WordPress, and owner of WordPress.com, Matt Mullenweg, seems to have taken leave of his senses. His behaviour has been so erratic over the last month that I cannot recommend using his company (WordPress.com/Automattic) to host your blog. I probably wouldn't have recommended this anyway, as I think Automattic is pretty aggressive at upselling to unsuspecting new users. Since the WordPress software is openly licensed, anyone else can use it and provide hosting for you, as I outlined above. The software itself is very robust and several of the other software options I suggest provide exports using the WordPress XML export standard.

Wix is the Yahoo Mail of blogging platforms, with a laggy, busy interface that is constantly upselling to you. It also doesn't provide an export function - if you start a Wix site you're essentially stuck paying Wix until they go bankrupt and your blog is deleted forever.

Squarespace was recommended to me as a good option that "just works" when I asked for suggestions on Mastodon. Squarespace does provide exports using the WordPress XML standard. At AUD$16 per month I don't consider Squarespace a good deal compared to the nearest alternative of hosted WordPress - it's not open source so the only host you can use for a Squarespace blog is Squarespace (although you can export your blog in the WordPress XML format to take it somewhere else). You're also at the mercy of Squarespace's corporate strategy.

What next?

Once you've set up your blog, you can add it to the list at ausglamr.newcardigan.org. Then every time you publish a blog post, it will be shared with the GLAMR world. You can add certain tags to your post if you don't want a particular post to be added to the Aus GLAMR feed.

Now get blogging!


by Hugh Rundle at January 15, 2025 12:00 AM

January 14, 2025

Ed Summers

Everything

A trail with snow and a dog

Archives have never collected everything, but everything can become archival.

This was a somewhat random grandiose utterance during a conversation today about social media archiving and the Records Continuum while thinking about Suzanne Briet.

January 14, 2025 05:00 AM

Lucidworks

Meet Lucidworks AI, the AI orchestration engine for search

Lucidworks AI empowers businesses to seamlessly integrate, manage, and optimize generative AI, driving innovation and efficiency while ensuring accuracy and responsible use.

The post Meet Lucidworks AI, the AI orchestration engine for search appeared first on Lucidworks.

by Lucidworks at January 14, 2025 02:48 AM

January 13, 2025

David Rosenthal

Storage Roundup

It is time for another roundup of topics in storage that have caught my eye recently. Below the fold I discuss the possible ending of the HAMR saga and various developments in archival storage technology.

Heat-Assisted Magnetic Recording

Unless you have been tracking storage technology for many years, it is hard to appreciate how long the timescales are. My go-to example for communicating this is Seagate's development of HAMR.

Seagate first demonstrated HAMR in 2002. In 2008 they published this graph, predicting HAMR would supplant Perpendicular Magnetic Recording (PMR) starting in 2009.

I first wrote skeptically about projections of HAMR's deployment twelve years ago. Seagate had just demonstrated HAMR at a terabit per square inch and predicted market entry in 2014.

I wrote again in 2013. In 2015 I wrote more about it. Then in 2016 I wrote about it again.

Seagate's 2018 HAMR roadmap
In 2018 I wrote about Chris Mellor's Seagate HAMRs out a roadmap for future hard drive recording tech:
Seagate has set a course to deliver a 48TB disk drive in 2023 using its HAMR (heat-assisted magnetic recording) technology, doubling areal density every 30 months, meaning 100TB could be possible by 2025/26. ... Seagate will introduce its first HAMR drives in 2020. ... a 20TB+ drive will be rolled out in 2020.
So in a decade the technology had gone from next year to the year after next. The year after next Jim Salter wrote HAMR don’t hurt ’em—laser-assisted hard drives are coming in 2020:
Seagate has been trialing 16TB HAMR drives with select customers for more than a year and claims that the trials have proved that its HAMR drives are "plug and play replacements" for traditional CMR drives, requiring no special care and having no particular poor use cases compared to the drives we're all used to.
But no, it would be another four years before we saw the first signs of HAMR drives in the market. In December 2024 Matthew Connatser reported that Seagate launches 32TB Exos M hard drive based on HAMR technology – Mozaic 3+ drives are the world’s first generally available HAMR HDDs:
Seagate’s biggest-ever hard drive is finally here, coming with 32TB of capacity courtesy of the company’s new HAMR technology (via Expreview).

It has almost been a year since Seagate said it had finally made a hard drive based on heat-assisted magnetic recording (HAMR) technology using its new Mozaic 3+ platform.
...
Exos drives based on Mozaic 3+ were initially released to select customers in small quantities, but now the general release is (nearly) here, thanks to mass production.
Note that the drives that are "(nearly) here" are still not available from Amazon, although they are featured on Seagate's web site. Kevin Purdy writes:
Drives based on Seagate's Mozaic 3+ platform, in standard drive sizes, will soon arrive with wider availability than its initial test batches. The drive maker put in a financial filing earlier this month (PDF) that it had completed qualification testing with several large-volume customers, including "a leading cloud service provider," akin to Amazon Web Services, Google Cloud, or the like. Volume shipments are likely soon to follow.

There is no price yet, nor promise of delivery, but you can do some wishful thinking on the product page for the Exos M, where 30 and 32TB capacities are offered. That's 3TB per platter, and up to three times the efficiency per terabyte compared to "typical drives," according to Seagate.
A further indication that volume shipments could happen "next year" comes from Chris Mellor's WD’s HAMR switch could be closer than we think:
Intevac has said there is strong interest in its HAMR disk drive platter and head production machinery from a second customer, which could indicate that Western Digital is now involved in HAMR disk developments following Seagate’s move into volume production.

Intevac supplies its 200 Lean thin-film processing machines to hard disk drive media manufacturers, such as Seagate, Showa Denko and Western Digital. It claims more than 65 percent of the world’s HDD production relies on its machinery. The Lean 200 is used to manufacture recording media, disk drive platters, for current perpendicular magnetic recording (PMR) disks.

Intevac’s main customer for HAMR-capable 200 Lean machines is Seagate, which first embarked on its HAMR development in the early 2000s. It is only this year that a prominent cloud service provider has certified Seagate’s Mozaic 3 HAMR drives for general use, more than 20 years after development first started. The lengthy development period has been ascribed to solving difficulties in producing drives with high reliability from high yield manufacturing processes, and Intevac will have been closely involved in ensuring that its 200 Lean machines played their part in this.

Archival Media

Maybe 2025 will be the year I can finally bring my 12-year-long series about HAMR shipment schedules to a close, 26 years after Seagate started work on the technology. Why have I been HAMR-ing on Seagate all these years, and again now? Not to denigrate Seagate's engineering. Getting a HAMR drive into volume production that meets both the incredibly demanding standards for storage media reliability and performance, and the manufacturing yields needed for profit, is an extraordinarily difficult feat. It is not a surprise that it took a couple of decades.

My criticisms have been aimed at the storage industry's marketing and PR, which hypes developments that are still in the lab as if they are going to solve customers' problems "next year". And at the technology press, which took far too long to start expressing skepticism. Seagate's marketing eventually lost all credibility, with their predictions about HAMR becoming an industry joke.

The situation is far worse when it comes to archival media. The canonical article about some development in the lab starts with the famous IDC graph projecting the amount of data that will be generated in the future. It goes on to describe the density some research team achieved by writing say a gigabyte into their favorite medium in the lab. This conveys four false impressions:
Consumers already have an affordable, durable archival medium. As I have shown:
Surprisingly, with no special storage precautions, generic low-cost media, and consumer drives, I'm getting good data from CD-Rs more than 20 years old, and from DVD-Rs nearly 18 years old.
The market for DVD-R media and drives is gradually dying because they have been supplanted in the non-archival space by streaming, an illustration that consumers really don't care about archiving their data!

DNA Storage

In 2018's DNA's Niche in the Storage Market I imagined myself as the product marketing guy for an attempt to build a rack-scale DNA storage system, and concluded:
Engineers, your challenge is to increase the speed of synthesis by a factor of a quarter of a trillion, while reducing the cost by a factor of fifty trillion, in less than 10 years while spending no more than $24M/yr.
The only viable market for DNA storage is the data-center, and the two critical parameters are still the write bandwidth and the write cost. As far as I'm aware despite the considerable progress in the last 6 years both parameters are still many orders of magnitude short of what a system would have needed back then to enter the market. Worse, the last six years of data center technology development have increased the need for write bandwidth and reduced the target cost. DNA storage is in a Red Queen's Race and it is a long way behind.

Nevertheless, DNA's long-term potential as an archival storage medium justifies continued research. Among recent publications is Parallel molecular data storage by printing epigenetic bits on DNA by Cheng Zhang et al, which avoids the need to synthesize strands of DNA by attaching the bits to preexisting strands. In principle this can be done in parallel. As is traditional, they start by asserting:
The markedly expanding global data-sphere has posed an imminent challenge on large-scale data storage and an urgent need for better storage materials. Inspired by the way genetic information is preserved in nature, DNA has been recently considered a promising biomaterial for digital data storage owing to its extraordinary storage density and durability.
The paper attracted comment from, among others, The Register, Ars Technica and Nature. In each case the commentary included some skepticism. Here are Carina Imburgia and Jeff Nivala from the University of Washington team in Nature:
However, there are still challenges to overcome. For example, epigenetic marks such as methyl groups are not copied by the standard PCR techniques used to replicate DNA, necessitating a more complex strategy to preserve epi-bit information when copying DNA data. The long-term behaviour of the methyl marks (such as their stability) in various conditions is also an open question that requires further study.

Another challenge is that many applications require random access memory (RAM), which enables subsets of data to be retrieved and read from a database. However, in the epi-bit system, the entire database would need to be sequenced to access any subset of the files, which would be inefficient using nanopore sequencing. Moreover, the overall cost of the new system exceeds that of conventional DNA data storage and of digital storage systems, limiting immediate practical applications;
You have to read a long way into the paper to find that:
we stored 269,337 bits including the image of a tiger rubbing from the Han dynasty in ancient China and the coloured picture of a panda ... An automatic liquid handling platform was used to typeset large-scale data at a speed of approximately 40 bits s−1
This is interesting research but the skepticism in the commentaries doesn't exactly convey the difficulty and the time needed to scale from writing less than 40KB in a bit under 2 hours, to the petabyte/month rates (about 2.8TB every 2 hours) Facebook was writing a decade ago. This would be a speed-up of nearly 8 orders of magnitude to compete with decade-old technology.
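The figures in the paragraph above can be rechecked with a few lines of arithmetic. This is a back-of-the-envelope sketch, assuming a decimal petabyte (10^15 bytes), a 30-day month, and the paper's approximate rate of 40 bits per second; the variable names are mine, not the paper's. The last line prints the size of the gap in orders of magnitude.

```python
import math

BITS_STORED = 269_337        # bits written in the paper's demonstration
WRITE_RATE_BPS = 40          # approximate write speed in bits per second
PETABYTE = 10**15            # bytes; a decimal petabyte is assumed
HOURS_PER_MONTH = 30 * 24    # a 30-day month is assumed

# Size and duration of the demonstration write.
demo_bytes = BITS_STORED / 8
demo_hours = BITS_STORED / WRITE_RATE_BPS / 3600

# Bytes written in a 2-hour window at each rate.
target_bytes_per_2h = PETABYTE / (HOURS_PER_MONTH / 2)
demo_bytes_per_2h = WRITE_RATE_BPS / 8 * 2 * 3600

# Ratio between the two write rates.
speedup = target_bytes_per_2h / demo_bytes_per_2h

print(f"demonstration size: {demo_bytes / 1000:.1f} KB")
print(f"demonstration time: {demo_hours:.2f} hours")
print(f"petabyte/month per 2 hours: {target_bytes_per_2h / 1e12:.1f} TB")
print(f"speed-up needed: {speedup:.2e} (10^{math.log10(speedup):.1f})")
```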

Diamonds

Chinese boffins find way to use diamonds as super-dense and durable storage medium by Laura Dobberstein reports that:
The research, published in Nature Photonics, highlights that the breakthrough extends beyond density. It is said to offer significant improvements in write times – as little as 200 femtoseconds – and lives up to the promise that "a diamond is forever" by offering millions of years of allegedly maintenance-free storage. Diamonds are highly stable by nature and the authors have claimed their medium could protect data for 100 years even if kept at 200°C.

High-speed readout is demonstrated with a fidelity of over 99 percent, according to the boffins.

Scientists have been eyeing diamonds as storage devices for a while. Researchers at City College of New York in 2016 claimed to be the first group to demonstrate the viability of using diamond as a platform for superdense memory storage.
These researchers, like so many others in the field, fail to understand that the key to success in archival storage is reducing total system cost. Long-lived but expensive media like diamonds are thus counter-productive.

Project Silica

I wrote about Microsoft's Project Silica last March, in Microsoft's Archival Storage Research. The more I think about this technology, the more I think it probably has the best chance of impacting the market among all the rival archival storage technologies:
The expensive part of the system is the write head, because it uses costly femtosecond lasers. The eventual system's economics will depend upon the progress made in cost-reducing the lasers.

by David. (noreply@blogger.com) at January 13, 2025 08:47 PM

LibraryThing (Thingology)

Author Interview: Kim Dower

Kim Dower

LibraryThing is pleased to sit down this month with poet and book publicist Kim Dower, who has worked with authors from Kristin Hannah to Paulo Coelho through her freelance literary publicity company, Kim-from-L.A. The City Poet Laureate of West Hollywood from October 2016 – October 2018, she is the author of five previous collections of poetry, including the bestselling I Wore This Dress Today for You, Mom (2022), which was praised by The Washington Post as a “fantastic collection.” Her first collection, Air Kissing on Mars (2010), was praised by the Los Angeles Times as “sensual and evocative… seamlessly combining humor and heartache.” Her work has appeared in literary publications such as Plume, Ploughshares, Rattle, The James Dickey Review, and Garrison Keillor’s “The Writer’s Almanac.” Her newest book, What She Wants: Poems on Obsession, Desire, Despair, Euphoria, will be published later this month by Red Hen Press. Dower sat down with Abigail to answer some questions about her work, and this new book.

What She Wants is your sixth poetry collection, and addresses the theme of obsessive love. What was the inspiration behind the book? Did it begin with a specific poem, a personal experience you wanted to explore, or something else?

I was reading an article (can’t remember where!) and came upon the word “Limerence.” I thought it was a beautiful sounding word, and its meaning, the state of being obsessively infatuated with someone, usually accompanied by delusions of or a desire for an intense romantic relationship with that person, fascinated me! I became obsessed with a word that meant to be obsessed! I realized I had many finished poems and many in the works that fit into this category, so I built a collection based on this idea and the four stages of limerence: infatuation, crystallization, deterioration and ecstatic release.

What makes poetry unique, as a form of literary expression? Is it just the structure that makes it different from prose, or does it communicate in different ways?

Because poetry is the most concise form of language, good poems will stir our emotions with a clarity and intensity that immediately takes hold in the reader. There’s an emotional honesty in poems that connects poet to reader to create a shared experience. It has been said that prose is like walking and poetry is like dancing. A single, short poem has the power to simultaneously comfort and terrify. The poet W.H. Auden says, “poetry is the clear expression of mixed feelings,” and this is true for the poet as she writes and the reader as well.

Can you tell us a little bit about your writing process? How does a poet begin a poem?

I don’t know how all poets begin a poem, but I begin one after being stirred or moved by something, something personal or something I’ve read or overheard. Or something I think is funny. I often read a news headline or hear something on the radio as I’m driving that immediately says THIS IS A POEM! I was once driving, listening to the local news, and the headline, talking about a new public school decision was, “They’re Taking Chocolate Milk Off the Menu!” I pulled over and wrote a poem with that title. Later, after it was published, Garrison Keillor read it on “The Writer’s Almanac.” Poems are everywhere and I use everything I see and hear as a prompt – whether it’s something whimsical that strikes me, or something more profound like hearing a dead parent speak to me.

How has working with so many different authors, through your activities as a publicist, affected your writing?

The only way working hard at a “day” job has affected my writing is I’m very focused when I sit down to write. I’ve learned how to separate the two kinds of work and my brain and mind like knowing and appreciate the difference!

You were Poet Laureate of the city of West Hollywood for two years. What sort of things did you do as a poet laureate?

It was so much fun creating different activities, readings and events and introducing people to poetry who otherwise never thought about it. My favorite project was creating a collaborative poem with people in the city. The City of West Hollywood is committed to the arts and supported all of my ideas. We designed a large pad with three prompts and I spent a few months asking strangers at local bookstores, cafes, parks, to participate in reading a prompt and writing some lines. People really enjoyed it and I created a powerful poem consisting of all their lines called, “I Sing the Body West Hollywood.” We made posters. We celebrated!

Who are some of your favorite poets, and how has their work influenced your own?

I have so many favorites and so many whose work has influenced my own. More than influence – whose work has given me permission to build my own voice. I love Frank O’Hara – New York School of Poets – who’s influenced my “conversational” often breezy style while still packing a punch! William Carlos Williams, whose poetry has taught me to strive to make each poem a “fine machine.” Erica Jong, Sharon Olds and Kim Addonizio, for their passion, beauty, perceptions; Thomas Lux, Ron Padgett, Stephen Dunn, for humor mixed with deep emotion and insight. W.H. Auden for his style. This list could go on and on.

Tell us about your library. What’s on your own shelves?

I have hundreds and hundreds of books! I love all kinds of fiction, biographies, memoir, but upstairs, in my “Poetry Palace” I have only poetry – books I’ve kept and carried for 50 years – from college through today. I have a marvelous collection from Shakespeare to contemporary poets. Occasionally, just to calm myself, I will sit on the floor and take a random book off the shelf, read one or two poems, and place it back. This morning, for example, it was Diane di Prima’s book, The Poetry Deal. I read from it aloud. Now I can go on with my day.

What have you been reading lately, and what would you recommend to other readers?

I’m re-reading Vivian Gornick’s amazing, gorgeous memoir, Fierce Attachments, about her relationship with her mother. It’s a classic and each time I read it I discover something else – not only about her – but about myself.

I’m also re-reading Savage Beauty: The Life of Edna St Vincent Millay, a great poet and a fascinating star of poetry.

My poet friend, Nina Clements – who was also a Librarian – sent me a book called Monsters by Claire Dederer, which I’m enjoying, about the link between genius and monstrosity. How do we balance our love of some artists knowing the awful things they’ve done? This is a subject that constantly fascinates me.

And I’m slowly reading and loving the poems in Kim Addonizio’s new collection, Exit Opera.

by Abigail Adams at January 13, 2025 07:23 PM

Digital Library Federation

NDSA Interest and Working Groups 2024 Year in Review

Continuing the review of 2024, the following summarizes the activities of the NDSA Interest and Working Groups. Please have a look at NDSA’s accomplishments – and feel free to reach out to NDSA with any questions on how you can get involved!

Interest Groups

Content Interest Group

For the year, the Content Interest Group met quarterly on the first Thursday of the month at 12:00pm EST. We identified topics of interest through an ongoing but now defunct jamboard! We held 4 meetings utilizing various formats to facilitate the exchange of information. In February we held a Content Exchange about how your organization manages the access levels of digital content in your reading rooms, physical and virtual. Due to the success of the first Content Exchange, we held another one in May on how your organizations are increasing representation or including under-represented groups in your collections. In August, we switched it up with presentations by metadata experts and discussion on understanding metadata standards and ways to incorporate them in our work. Julie Shi, Digital Preservation Librarian, Scholars Portal, University of Toronto Libraries, discussed METS. Leslie Johnston, Director of Digital Preservation, U.S. National Archives and Records Administration, discussed PREMIS. We wound the year down with our final meeting in November, a discussion on using content to show the impact of preservation or the risk of loss.

Infrastructure Interest Group

For the Infrastructure Interest Group, 2024 began with a discussion led by the founders of the AEOLIAN Network, a project whose focus was “to investigate the role that AI can play to make born-digital and digitised cultural records more accessible to users.” Its outcomes included multiple workshops, case studies and journal publications, all of which focused on the larger community’s use of AI in this space. During its next two quarterly meetings, group members presented on their unique requirements and solutions surrounding content staging areas and repository ingest workflows. We listened to in-depth descriptions of workflows from the University of Alabama, Birmingham, the University Libraries at Ohio State University and finally the UW Digital Collections Center, University of Wisconsin, Madison. Our final meeting of the year introduced the Internet Archive’s Vanishing Culture: A Report on Our Fragile Cultural Record as a reading selection, from which several thought-provoking essays were brought forward for discussion by group members.

Standards & Practices Interest Group

The Standards & Practices Interest Group met quarterly on the first Monday of the month at 1:00 p.m. Eastern. Our topics for this year included: Digital preservation system migration; The language of the cloud; Selection for preservation; and Persistent identifiers and preservation. Michael Dulock presented his experience with migrating preservation systems at UC-Boulder for our January meeting. Our subsequent meetings were member-driven discussions. By far the most engaging and well-attended discussion was our exploration of “the language of the cloud” at our April 1, 2024 meeting. We shared experiences with outsourcing infrastructure to major cloud-based vendors (AWS, Azure, etc), and how that has impacted our preservation practices. Out of this discussion, we formed a sub-group to develop a survey on cloud-based infrastructure practices across the membership, which included members of the Infrastructure Interest Group. With the release of the NDSA Storage Survey, we will resume work on the Cloud Services sub-group in 2025, with a follow-up discussion scheduled for our first S&P IG meeting on January 13, 2025.

Working Groups

Communications and Publications Working Group

The Communications and Publications Working Group (CAPs) worked with the Coordinating Committee and chairs of Interest and Working Groups to update internal documentation and website content and publish blog posts and reports. CAPs works with survey Working Groups to edit and publish the reports, sometimes working on statistical analysis quality assurance. This year CAPs developed additional guidelines around accessibility for report preparation.

Climate Watch Working Group

The Climate Watch Working Group had a productive year establishing our workflows, clarifying our publication criteria and objectives, and setting up our publication platform. We hope to release our inaugural publication early in 2025, so keep an eye out for announcements! 

Events Strategy Working Group

Beginning in Spring 2024, the Events Strategy Working Group (ESWG) focused on a framework for operationalizing recommendations from a previous planning group. They held monthly meetings, with breakout discussions in which subgroups worked on plans for a National Conference and Designated Communities. Despite some uncertainties about NDSA’s organizational affiliation, ESWG plans to deliver three key items by April 2025: (1) a charge for a standing Events Steering Committee to manage NDSA’s overall events strategy and serve as a liaison to annual conference committees; (2) an annual meeting toolkit with recommendations for both in-person and online events (with in-person events to resume in 2027 and coincide with NDSA Excellence Awards); and (3) an action plan to encourage local and regional communities of practice with affiliated organizations and institutions. This action plan will define a process for developing an experts list and a speaker’s bureau to support digital preservation activities and workshops as well as a mechanism for endorsing digital preservation panels at affiliated events.

Excellence Awards

In 2024, the Excellence Awards Working Group utilized the year without an awards cycle to promote EAWG through blogs and video clips. Blogs were published via SAA’s bloggERS and the NDSA blog. Video clips were uploaded to the NDSA YouTube channel, and an Excellence Awards playlist was created to group them. Further blogs and clips are scheduled into 2025.

In addition, EAWG co-chairs drafted an Overview and Guidelines for the EAWG, which has been reviewed by the Communications and Publications Working Group. The EAWG expects to finalize this document in 2025. Finally, Jessica Venlet accepted the position of EAWG Co-chair for 2025-2027.

Levels of Digital Preservation

This year a new Levels Revision Working Group was formed with a remit to carry out a focused review of the Levels, looking specifically at their environmental impact. Look out for further news on this work in 2025.

This year we have run four Open Sessions: in January we held a general Q&A on the Levels, in April we focused on the Curation Guide, in July the topic was the Assessment Tool and October’s session provided a general introduction to the Levels aimed at those who hadn’t used them before. The Levels Steering Group also gave a presentation on Levels at the Virtual DigiPres conference.

We have seen a few changes on the Levels Steering Group, welcoming new members Rebecca Fraimow, Elizabeth La Beaud and Keith Pendergrass. Karen Cariani left the Steering Group to focus on other priorities and we thank her for all her hard work.

Membership Working Group

The Membership Working Group worked on a new process for onboarding new members to NDSA, and produced a comprehensive report with six detailed proposals designed to increase engagement of members. One of these proposals focused on having a standing Membership Working Group, who would provide onboarding and ongoing membership support. This group will launch in January 2025.

Storage Survey

The working group published the 2023 storage infrastructure report in October, along with anonymized survey results, the survey codebook, and a crosswalk between the 2019 and 2023 survey questions. Working group members presented the survey results at iPRES, DLF, and SAA’s Research Forum.

The post NDSA Interest and Working Groups 2024 Year in Review appeared first on DLF.

by Carol Kussmann at January 13, 2025 02:48 PM

January 09, 2025

Peter Murray

Issue 102: Electricity Infrastructure

I'm about halfway through Saul Griffith's 2021 Electrify: An Optimist's Playbook for Our Clean Energy Future, and I find the author makes a compelling point about bringing nearly everything—energy creation, transmission, and use—to a common factor of "electricity" and then optimizing that system. There are many interesting problems to solve, but they seem solvable.

In last week's Thursday Threads, I touched on how data centers impact the electrical grid. This week's issue looks further into how electricity is generated and distributed. The first article reflects back on the data center topic—it could have just as easily gone in last week's issue. Then there are a few other articles on generation, storage, the flip away from carbon-based fuels, and a look at history.

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

Commercial Electricians in Demand for Data Center Construction

These traveling electricians are transforming the sagebrush here in central Washington, with substations going up on orchards and farmland. Hundreds have come to a triangle of counties tied together by hydropower dams along the Columbia River. They are chasing overtime and bonuses, working 60-hour weeks that can allow them to make as much as $2,800 a week after taxes. For all the hype over $100,000 chips and million-dollar engineers, the billions pouring into the infrastructure of A.I. is being built by former morticians, retired pro football linebackers, single moms, two dudes described as Gandalf in overalls, onetime bouncers and a roving legend known as Big Job Bob.
A.I., the Electricians and the Boom Towns of Central Washington, New York Times, 25-Dec-2024

The New York Times publishes this in-depth piece about the boom time for commercial electricians (or anyone who wants to train to become one). Data centers require substantial electrical power to support the high computing needs of artificial intelligence and the storage to save your New Year's Eve photos (as well as the power to run the cooling systems for those computers). Although AI has propelled the construction of data centers to a sharper slope, significant building and expansion projects were already underway. This article is a view at the intersection of traditional construction/labor, technology, land use, and economic growth.

Instability Caused By Overgeneration of Rooftop Solar

[The Australian Energy Market Operator] said the ever growing output from solar was posing an increasing threat to the safety and security of the grid because it was pushing out all other forms of generation that were needed to help keep the system stable. And it warned that unless it had the power to reduce — or curtail — the amount of rooftop solar times, more drastic and damaging measures would need to be taken. These could include increasing the voltage levels in parts of the poles-and-wires network to "deliberately" trip or curtail small-scale solar in some areas. An even more dramatic step would be to "shed" or dump parts of the poles-and-wires network feeding big amounts of excess solar into the grid.
AEMO wants emergency powers to switch off solar in every state amid fears of 'system collapse', Australian Broadcasting Corporation, 1-Dec-2024

Electricity is unique in that the providers must exactly match the demand at every moment. Excess generation capacity must be removed from the grid...it is just as bad as too little electricity. (Storage of excess electricity is a topic all its own; see below.) In Australia, the rapid growth of solar power generation is making it difficult for the grid operator to achieve that balance. Rooftop solar is great, but having that energy dumped uncontrolled back onto the grid causes instability.

(That isn't the only problem on the grid...there are devices that, as Grady of Practical Engineering says, "force the grid to produce power and move it through the system, even though they aren’t even consuming it.")

Generating Power from the Tides

Solar energy is the bedrock of most renewable energy grid plans – but lunar energy is even more predictable, and a number of different companies are working to commercialize energy generated from the regular inflows and outflows of the tides. One we've completely missed is Minesto, which is taking a very different and remarkably dynamic approach compared to most. Where devices like Orbital's O2 tidal turbine more or less just sit there in the water harvesting energy from tidal currents, Minesto's Dragon series are anchored to the sea bed, and fly around like kites, treating the currents like wind.
28-ton, 1.2-megawatt tidal kite is now exporting power to the grid, New Atlas, 11-Feb-2024

The problem with variable sources like solar and wind is the need for a baseline supply of always-there electricity. Coal, natural gas, and nuclear are good at meeting that baseline power need. Tidal systems are a clean, constant source of energy as well.

Smarter Grid Reduces Demand As Required

On the morning of April 3, Taiwan was hit by a 7.4 magnitude earthquake. Seconds later, hundreds of battery-swap stations in Taiwan sensed something else: the power frequency of the electric grid took a sudden drop, a signal that some power plants had been disconnected in the disaster. The grid was now struggling to meet energy demand. These stations, built by the Taiwanese company Gogoro for electric-powered two-wheeled vehicles like scooters, mopeds, and bikes, reacted immediately. According to numbers provided by the company, 590 Gogoro battery-swap locations ... stopped drawing electricity from the grid, lowering local demand by a total six megawatts—enough to power thousands of homes. It took 12 minutes for the grid to recover, and the battery-swap stations then resumed normal operation.
How battery-swap networks are preventing emergency blackouts, MIT Technology Review, 11-Jun-2024

In addition to managing the supply, there also need to be advancements in managing the demand side. Businesses already do this...their flexibility to reduce their electricity usage during high-demand events results in cheaper electricity rates because the utility doesn't need to build as much capacity just-in-case. This kind of variable pricing is also available to some homeowners. However, technology on the grid can help support this as well. This article talks about a scooter battery charging company that automatically takes equipment offline when generation capacity unexpectedly drops. Imagine this same sort of grid intelligence available for e-vehicle charging stations as well.
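As a rough illustration of the idea (not Gogoro's actual implementation, which the article does not describe), a charger that watches grid frequency and pauses itself could be sketched like this. The class name and both threshold values are hypothetical; a real system would use limits set by the grid operator.

```python
from dataclasses import dataclass


@dataclass
class FrequencyAwareCharger:
    """Toy model of a charger that sheds load when grid frequency sags.

    The thresholds below are illustrative assumptions, not real grid codes.
    """
    trip_below_hz: float = 59.7     # hypothetical: stop charging under this
    resume_above_hz: float = 59.95  # hypothetical: resume once back above this
    charging: bool = True

    def update(self, measured_hz: float) -> bool:
        """Feed in a frequency reading; return whether the charger is drawing power."""
        if self.charging and measured_hz < self.trip_below_hz:
            # Frequency drop signals too little generation: stop drawing power.
            self.charging = False
        elif not self.charging and measured_hz >= self.resume_above_hz:
            # Grid has recovered: resume normal operation.
            self.charging = True
        return self.charging


if __name__ == "__main__":
    charger = FrequencyAwareCharger()
    for hz in (60.0, 59.5, 59.8, 59.96):
        print(hz, charger.update(hz))
```

The gap between the two thresholds keeps the charger from flapping on and off while the frequency hovers near a single cutoff.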

Storing Energy in Mine Shafts

One of Europe’s deepest mines is being transformed into an underground energy store. It will use gravity to retain excess power for when it is needed. The remote Finnish community of Pyhäjärvi is 450 kilometres north of Helsinki. Its more than 1,400-metre-deep zinc and copper Pyhäsalmi mine was decommissioned but is now being given a new lease of life by Scotland-based company Gravitricity. The firm has developed an energy storage system that raises and lowers weights, offering what it says are “some of the best characteristics of lithium-ion batteries and pumped hydro storage”.
This disused mine in Finland is being turned into a gravity battery to store renewable energy, Euro News, 6-Feb-2024

Solar panels only produce power when the sun is out, and wind turbines only produce power when the wind blows. We will need a way to store energy during times of overproduction and send it out to the grid when demand requires it. Many technologies are being explored to use excess energy to pump water uphill or spin a heavy flywheel. The technique in this article raises weights in a deep mine shaft to store energy.
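For a sense of scale, the energy held by a raised weight is its mass times gravitational acceleration times the height it can fall. The sketch below applies that formula to the roughly 1,400-metre depth quoted above. The 1,000-tonne weight is a made-up figure for illustration, not a number from Gravitricity, and the default ignores conversion losses.

```python
G = 9.81  # gravitational acceleration in m/s^2


def gravity_storage_kwh(mass_kg: float, height_m: float, efficiency: float = 1.0) -> float:
    """Return stored energy in kWh using E = m * g * h.

    `efficiency` scales the result for round-trip losses; 1.0 means lossless.
    """
    joules = mass_kg * G * height_m * efficiency
    return joules / 3.6e6  # 1 kWh = 3.6 million joules


if __name__ == "__main__":
    # Hypothetical 1,000-tonne weight dropped the full 1,400 m shaft depth.
    print(round(gravity_storage_kwh(1_000_000, 1400), 1), "kWh")
```

Under those assumptions a single weight stores a few megawatt-hours, which shows why both depth and mass matter for this kind of storage.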

Storing Energy as Compressed Air

Toronto-based Hydrostor Inc. is one of the businesses developing long-duration energy storage that has moved beyond lab scale and is now focusing on building big things. The company makes systems that store energy underground in the form of compressed air, which can be released to produce electricity for eight hours or longer.
Hydrostor Inc., a leader in compressed air energy storage, aims to break ground on its first large plant by the end of this year, Inside Climate News, 4-May-2024

Another potential storage solution is compressed air. All of these systems have trade-offs of expense versus capacity versus location requirements and other factors. Some of these experiments will succeed, and some won't be commercially viable.

The Last Coal-fired Powerplant in Hawaii is Replaced

Hawaii shut down its last coal plant on September 1, 2022, eliminating 180 megawatts of fossil-fueled baseload power from the grid on Oahu — a crucial step in the state’s first-in-the-nation commitment to cease burning fossil fuels for electricity by 2045. But the move posed a question that’s becoming increasingly urgent as clean energy surges across the United States: How do you maintain a reliable grid while switching from familiar fossil plants to a portfolio of small and large renewables that run off the vagaries of the weather? Now Hawaii has an answer: It’s a gigantic battery, unlike the gigantic batteries that have been built before.
A huge battery has replaced Hawaii's last coal plant, Canary Media, 10-Jan-2024

With new generation and storage technologies, where does that leave the traditional burning-carbon-based tools? Fortunately, not long for this world.

The Rise of Renewables

Wind and solar generated more power than coal through the first seven months of the year, federal data shows, in a first for renewable resources. The milestone had been long expected due to a steady stream of coal plant retirements and the rapid growth of wind and solar. Last year, wind and solar outpaced coal through May before the fossil fuel eventually overtook the pair when power demand surged in the summer. But the most recent statistics showed why wind and solar are on track in 2024 to exceed coal generation for an entire calendar year — with the renewable resources maintaining their lead through the heat of July.
U.S. Wind and Solar Are on Track to Overtake Coal This Year, Scientific American, 13-Aug-2024

It would seem that the momentum away from burning carbon fuels is well established. I hope it is established enough to deal with the instability that could be caused by the incoming U.S. federal administration.

This Clock Made Power Grids Possible

On 23 October 1916, an engineer named Henry E. Warren quietly revolutionized power transmission by installing an electric clock in the L Street generating station of Boston’s Edison Electric Illuminating Co. This master station clock kept a very particular type of time: It used a synchronous self-starting motor in conjunction with a pendulum to help maintain the station’s AC electricity at a steady 60-cycle-per-second frequency. As more power stations adopted the clocks, the frequency regulation allowed them to share electricity and create an interconnected power grid.
This Clock Made Power Grids Possible, IEEE Spectrum, 28-Feb-2024

Before there was a grid, there were many isolated islands of power generation. The "alternating" part of "alternating current" meant that these islands couldn't be connected until the cycles of alternation could be synchronized. We take 60-cycles-per-second for granted now, but it wasn't always this way.

From Energy to No Energy

Cat with black spots lounging on a person's lap near a keyboard and mouse, creating a cozy workspace environment.

This has become Alan's routine in the morning. It is far too cold—and now far too snowy—to work outside on the patio. So Alan sleeps through the long winter days on my keyboard numeric pad until spring.

by Peter Murray at January 09, 2025 05:00 AM

Library Tech Talk (U of Michigan)

Accessibility remediation in action on our library website

A table with only 2 rows but the screen reader says “entering 420 results table” instead.

An accessibility violation found in our Staff Directory during baseline testing for the library website.

As part of a broader product accessibility initiative in Library Information Technology, the team behind the library’s website undertook a number of remediation efforts based on the findings of the site’s baseline accessibility evaluation. The work demonstrates how accessibility remediation can also be an opportunity for code clean-up, usability improvements, and refreshing design elements.

by Heidi Burkhardt at January 09, 2025 12:00 AM

January 08, 2025

In the Library, With the Lead Pipe

Cripping Conferences: An Autoethnographic Exploration of Disability in Academia

In Brief This paper employs autoethnography to expose the conference experiences of disabled scholars within the academic and library fields, highlighting the systemic barriers found in these professional settings. In integrating personal narratives with theoretical insights, this study highlights how rigid conference spaces and norms do not accommodate disabled bodyminds, which hinders professional development, and highlights the need for systemic changes. The barriers found include the mental load of navigating inaccessible spaces, the extra financial costs required for participation, and interference between the clock time of conferences and the crip time of disabled bodyminds. While conferences provide challenges, we also find them to be a place for crip connections, as seen in the authors’ friendship that provides both emotional and professional support. This paper concludes with theoretical implications for making conferences more accessible and the disentanglement of libraries and academia from their ideas of productivity and ideal workers.

By Rhys Dreeszen Bowman and Leah T. Dudak

Introduction

Positionality

Rhys Dreeszen Bowman (they/them) is a white, queer, nonbinary Ph.D. candidate at the University of South Carolina. They are a former high school librarian in rural New England. They have a middle-class family background and are also privileged in their whiteness, access to education, and status as a citizen. Rhys is physically disabled and chronically and mentally ill. Rhys is sometimes invisibly disabled and sometimes uses a mobility aid. They often can conceal their disability and pass as nondisabled, which reduces the ableism they experience. Passing as nondisabled also affects their ability to meet their accessibility needs and navigate the world.

Leah T. Dudak (she/her) is a white, cis, female, fat, librarian, and Ph.D. student at Syracuse University. She is disabled with diagnoses of fibromyalgia, dysgraphia, anxiety, and depression. She sees her body as disabled, but her disabilities are often invisible, so people cannot outwardly see them unless she uses a mobility aid or discloses her disability. Meanwhile, her body also carries immense privilege of sometimes passing as nondisabled, as well as the privileges of race, sexuality/gender, education, and socioeconomic status.

Note on language

We use the term bodymind because we consider the physical body and mind inseparable and to act in concert (Price, 2011). We resist the Western assumption that the body and mind are distinct and the privileging of the mind over the body (Clare, 2017). We also use the term crip to move our conversation outside the insular walls of academia and link our work to disability justice activists and the tangible lives of disabled people, moving beyond a disability rights movement that is primarily concerned with helping white men integrate into mainstream society as productive citizens (Hamraie, 2017). We align ourselves instead with the disability justice revolution, built by queer and trans Black, Indigenous, people of color (BIPOC) activists, that fights for the liberation of all sick/unwell/mad/neurodivergent/crip bodyminds.

Literature Review

We add to the tradition of disability scholars attuning to the role of disability and resistance in academia. In Activist Affordances, Dokumaci (2023) uses visual ethnography to chronicle the lives of invisibly disabled people related to arthritis and other inflammatory diseases. Dokumaci attunes to what she calls “unnoticed choreographies,” or the everyday actions disabled people take to move through the world (p. 2). The author calls activist affordances the “performative microacts/arts through which disabled people enact and bring into being the worlds that are not already available to them, the worlds they need and which to dwell in” (pp. 2-3). She posits that disabled futures already exist and activist affordances are “outposts” of these worlds. We draw on Dokumaci’s work to frame our disabled resistance within academia and the possibilities we find for disabled futurities.

In The Undercommons: Fugitive Planning & Black Study, Harney and Moten (2013) examine how institutions such as the university impede our ability to empathize and our capacity to love. They call the undercommons a “maroon community” of teachers and students that “refuse to ask for recognition and instead want to take apart, dismantle, tear down the structure that, right now, limits our ability to find each other, to see beyond it and to access the places we know lie beyond its walls” (p. 6). They argue that the university is designed to uphold capitalism, and the university creates a free labor source to benefit the State. We borrow from Harney and Moten’s understanding of the university as an agent of capitalism, extending their argument to the way the neoliberal values of academia limit and harm disabled academics.

Additionally, this work is adding to already existing literature around the inaccessibility of conferences such as Manwiller’s (2019) article highlighting how conferences can often be the hardest part of being an academic and harder than any other professional experience. In her follow-up critique of the 2021 ACRL conference, she writes: “At the time, I thought it was my responsibility to adapt to the conference setting if I wanted to be professionally active” (2021). Finally, she questions why library organizations act and conduct business as if disabled workers do not exist. Below, we expand upon Manwiller’s articles and discuss this tension between participating in a conference and creating our own space, but also demanding to be accommodated in the rest of this work.

Price (2011) challenges the academic assumption that disability affects the body while the mind remains untouched by illness. She pushes understandings of disability and education practices to include mental disability. Price challenges readers to reconsider ableist academic values such as productivity and independence and how they damage disabled people. Andersen (2024) uses autoethnography to interrogate how disability impacts the author’s experience as a librarian. Anderson notes how writing in the field of Library and Information Science (LIS) focuses on how libraries can serve patrons with disabilities, ignoring the possibility that librarians could themselves be disabled.

Theoretical framework

Autoethnography as a method

Autoethnography empowers us to share our experience and, in doing so, speak back to the silence that frames illness as private and unspeakable, which allows us to open a conversation about what it means to participate in librarianship and academia in bodyminds that are devalued and excluded. We argue that autoethnography is a crip methodology. When one’s own bodymind becomes the site of research, there is no need to travel or schedule to conduct interviews, which can tax the body and ignores our need for sudden rest. Conducting research on ourselves allows us to research as it suits our bodies. We work when we are able and rest when we need to. Crip autoethnography also values disabled stories as knowledge that is worthy of study (Richards, 2008; Kasnitz, 2020). We position ourselves as both subjects and objects, and through storytelling construct knowledge (Ellis, 2004). In this, we are not creating universal truths but sharing just as valid individual ones, which can still create disruption and push change.

We use autoethnography to connect the personal and political and “show how stories become the change we want to see in the world” (Holman Jones & Harris, 2019). Our research addresses the injustice within librarianship, academia, and within our own research (Madison, 2012). By analyzing our stories, we advance knowledge and offer possibilities for new practices. Using Kafer (2013) as a model, we tie crip autoethnography to queer autoethnography to focus on subjugated knowledges. Queering autoethnography speaks truth to power and in “that speaking enacts new worlds. Not just records them” (Holman Jones & Harris, 2019, p. 64).

Additionally, oftentimes, disability and chronic illness are written about from a medicalized point of view to inform caregivers or medical professionals. However, in this, the voice of the patient/client/disabled individual is often ignored (Piepzna-Samarasinha, 2018). As such, disabled bodies become something that is worked on, rather than an active participant in care. Autoethnography allows us to say in our own words our experiences, needs, wants, and desires without our stories then being filtered through a medicalized lens. This filtering erases our voice, but through autoethnography, we fully claim it, appreciate ourselves as experts, and analyze it in our own terms (Kasnitz, 2020; Richards, 2008). While many critics of autoethnography may see this closeness as a flaw, we see it as a reclamation of voice and story.

We also acknowledge that there is no one disabled experience, which is why we are writing together in some sections (such as this one) and also separately, even though we are both academics who share a discipline. We are both insiders and outsiders to each other (Richards, 2008). In using autoethnography, we give voice to our similarities and differences, which allows us to open a world of possibilities.

Crip time

We frame our experience of conference attendance through the lens of crip time, popularized by Alison Kafer (2013) in her book Feminist Queer Crip, who uses crip time to discuss the way time operates differently for queer and disabled people. Disabled people move at a slower (or faster) pace than normative society and might need more time to complete tasks or may be late to meetings due to navigating an inaccessible physical world. It is in direct conflict with normative[1] or clock time. We lose time to doctor visits, surgeries, and days spent in bed. Simple acts often result in the need for time to rest and recover. The adage that everyone has the same 24 hours crumbles upon an examination of disabled relationships with time. Kuppers calls crip time a “temporal shifting” (2014, paragraph 2), a way we slip out of place with clock time or the time at which normative society functions. Clock time is the time zone of neoliberal capitalism, while crip time belongs to the land of the ill. Samuels (2017) writes that

crip time is time travel. Disability and illness have the power to extract us from linear, progressive time with its normative life stages and cast us into a wormhole of backward and forward acceleration, jerky stops and starts, tedious intervals and abrupt endings (paragraph 5).

Crip time pushes disabled people out of time. The challenge and impossibility of operating on clock time is invisibilized and often overlooked when considering accessibility. Kafer suggests, “rather than bend disabled bodies and minds to meet the clock, crip time bends the clock to meet disabled bodies and minds” (2013, p. 27). Crip time is then a crucial feature of disabled experiences and must be considered when creating spaces welcoming of all bodyminds.

Chronopolitics is another way of understanding time as intrinsically tied to political behavior. Since bodies are inherently political, especially disabled ones, how disabled bodies move in time becomes a political behavior. The modern concept of time as we understand it was “discovered” in the fifteenth century, partly in order to frame new modern ideas of societal and individual progress. This conception of time led to ideas of human destiny, which motivated colonial projects across Europe and the United States (Toulmin & Goodfield, 1965). Time itself is socially constructed through political processes (Becker, 2019). Chronopolitics creates and maintains clock time. Within a university setting, Zembylas (2023) argues that chronopolitics is the “affective milieus” (how everyday interactions between individuals and their surroundings are imbued with power dynamics) within institutions of higher learning, as he:

…draws on existing studies in neoliberal academia to argue that changing academics’ affective habits created by dominant time discourses and practices requires the disruption of affective milieus in which time is channeled, routed and molded (2023, p. 493).

Zembylas proposes the liberatory potential of disrupting the chronopolitics of neoliberal academia. Within academia, chronopolitics functions as a hyperfocus on speed, efficiency, productivity, and manageability. Chronopolitics “…refers to the politics of time governing academic knowledge generation, epistemic entities, and academic lives and careers, as well as academic management processes more broadly speaking” (Felt, 2017, p. 54). In their autoethnography, Isaacs (2020) discusses the harm caused when disabled bodies, specifically bodies who stutter or do not speak efficiently, come into conflict with chronopolitics. These bodies are punished for their slowness and inability to meet the fast pace necessitated by chronopolitics. These functions of chronopolitics within academia align with neoliberal goals. Chronopolitics is the clock time that disabled bodyminds fail to adhere to, whose authority crip time pushes back against. Functioning within higher education, conferences are impacted by chronopolitics. When we are late to sessions because of our disability, we are punished for our lateness. The pressure to attend conferences to advance our careers is caused by the need to adhere to this academic time.

The neoliberalism of librarianship and academia

Many scholars have argued that academia is rooted in a tyrannical neoliberalism from which we must liberate ourselves for a more just and inclusive culture. The same neoliberal values can be found in librarianship and library work (Brady, 2023). Neoliberalism’s conception of the ideal worker is defined by the rigorous expectations for publishing and near-constant production.

This notion of the ideal worker sets the standard for accountability and performance in neoliberal higher education [and librarianship]. For example, performance-based research funding and reward systems now determine the kind of research projects, academic units, and professional behaviors that are valued (Vázquez & Levin, 2022, paragraph 3).

The endless grind of both librarianship and academia in the battle for higher prestige, doing more with less, and constant production is what Beretz calls the “academic culture of heroic stamina” (2003, p. 52), which ignores the mental and physical consequences, especially for marginalized folks that already face additional barriers to their participation. Vázquez and Levin (2022) argue that neoliberalism results in symbolic violence stemming from the “managerial practices” of neoliberalism. We use crip theory as a framework and materialist critique to examine the ways both librarianship and academia, which are steeped in neoliberal values of productivity, compose disabled bodies. Despite moves in librarianship and academia to be more inclusive and focus on diverse hiring, a perception remains that physically disabled people are unable to produce adequately, and production is the ultimate neoliberal goal. Even when disabled academics are able to conform to academia’s rigorous expectations for production, they are still punished or experience adverse long-term effects on productivity, morale, job satisfaction, physical or mental health (Pionke, 2019). Tenure and advancement within academia are not based solely on merit but on personality politics of who is seen to “fit” within academic culture, which often excludes any disabled people. Neoliberalism is firmly rooted in white supremacy. We acknowledge that with our whiteness, we are complicit in neoliberalism and the harm it causes.

Discussion and findings

Mental Load

Mental load is often used to describe the invisible mental work that is done by women in the household to continually keep things running, involving things like planning, anticipating, and organizing (Emma, 2018). We take this concept of invisible mental work and apply it to disability, highlighting the extra planning, considerations, and thoughts that disabled people have to navigate a conference setting. For example,

Leah

Pre conference:

Conference:

Post conference:

The above sketch is just a fraction of the thoughts that continually bombard my consciousness as I am getting ready for, going to, and following a conference. Typically, I am good at silencing racing thoughts, but during a conference these are ongoing. When a space is clearly not created for you, it takes more energy, advocacy, and creativity to navigate.

At a conference, I have the mentality of a marathon runner (something I could never do in real life): I am strategizing when to rest, when to spend, and those calculations need to be made ahead of time. I do not have the luxury of listening to my body in the moment and adjusting. If my body’s pain starts to get worse, my calculations are wrong and it is already too late. I am masking the inner turmoil these decisions cause, because I know if others could see my racing brain they would be uncomfortable, so instead I hold this discomfort for them. With my mental calculations, I try to predict the future, for my body, career, and connections with others. Ultimately, it is a zero-sum game: even when I win in one of those three areas, I lose in others.

Even if the chairs are too small and add more pain to my already pain-riddled body, I do not have the luxury of opting out. If I want a career in academia, if I want people to listen to and respect my voice, if I want to be able to continue to pay for the things that keep my body moving such as medication, physical therapy, massage, and more, conferences are required regardless of the toll and damage it does. So I push. Having this mental burden of continually making these decisions makes me feel isolated and distinctly other. This feeling of otherness and the reluctance from academia and librarianship to embrace and accept folks with disabilities (Pionke, 2023) is why I force my way in and make space for myself and others like me. As Beretz highlights, “Illness and injury, after all, are inescapable realities of human life” (2003, p. 51). Since illness is a part of life, I reject the medicalized view that disabled folks are damaged, and demand acceptance for my body now, especially since it will never be healed (Isaacs, 2020). We have every right to be able to pursue our passions, ideas, curiosities, and voice just like our nondisabled peers. And I will claim that space for myself, and others, even if it exhausts me to my core. Accessibility is not the responsibility of the individual; it is the responsibility of the system (Manwiller, 2021).

Disability Tax

The concept of a disability tax refers to the fact that it is often more expensive to move through the world in a disabled body (Olsen et al., 2022). In our disabled bodies, conference attendance is significantly more expensive for us than it is for many of our peers.

Rhys

When I travel through the airport, I need to tip the workers who push my wheelchair. I sometimes travel with an assistive device, which is cumbersome and expensive. The difficulty of moving through the airport means I need to pay to check a bag, as it would be too taxing to navigate me, my rollator, and a carry-on bag. I am denied the privilege of a free carry-on bag if I decide to travel with my assistive device. I cannot take public transportation while carrying a heavy bag and, therefore, need to take a taxi or Uber to and from the airport. I need to stay at the conference hotel to rest between sessions and reduce commute time to the conference. It is hard for me to share a room with peers, as during conferences, I often fall asleep by 7 p.m. to rest up for the next day. Many of my friends will stay up to forty minutes away from the conference in a cheaper area, as a group, and take public transportation to attend the conference. None of these cost-saving measures are possible for me. I need to be able to rest between sessions. I also must arrive the day before the conference and pay for that extra night. It is impossible for me to take an early morning flight and attend conference sessions on the same day. Paying for an extra night before and after the conference is challenging on a graduate student’s budget. Attending conferences as a disabled person is costly for me, both the cost to my health from exerting myself and the financial cost of funding a conference trip in a disabled body.

Despite these challenges, I continue to attend conferences because they are required to succeed as an academic. I need a venue to share my work, to network with colleagues, and yes, get lines for my CV. Some days, it all feels like too much, and I wonder if academia is the best place for me. But I feel a stubbornness and a refusal to be forced out. I know that my scholarly work matters and that our field needs people who are pushing back in the ways I am. If I were to give up, that would be one less disabled voice holding academia to task and pushing for a more inclusive and accessible culture. And also, I do this work because I can. While there is a financial and physical toll on me, I am able to succeed in academia despite what it costs me. I have periods of relative health when I am able to work the long hours required for this field. I have the kind of brain that allows me to work on projects weeks ahead of their deadlines so that even if I need to take time off when I’m ill, I can mostly complete my work on time. Most of the time, unless I am using my mobility aid, I am invisibly disabled. While there is pain that comes with not being seen, there is also a privilege in being seen as nondisabled and receiving the accompanying advancements. I am white and do not experience additional barriers in academia because of my race.

It is likely that while I experience some disadvantages because of my disability, gender, and queerness, I mostly benefit from academia’s exclusionary politics. I do this work for those who can’t or won’t, so disabled voices are still heard in academia. I do this work because the cost I pay is so much less than the cost many people experience. And also because I love it. I love the challenge and living in a world of ideas. I love that I spend my days writing and thinking, and I love that I can do this work with my friends. So even when it’s hard, and even when there are days I want to give up, I know I won’t. I’m in it for the long haul.

Crip time

Rhys

After conferences, I often spend up to a week in bed recovering and am unable to work. I become anxious about missing deadlines, and I worry I won’t ever feel any better. It is hard to explain to my professors and peers why I need so much rest after conferences, when they are able to return to work the day after traveling back from a conference. I feel frustrated by the toll the conferences take on me and my need to rest. Even those who know about my disability don’t understand what it’s like to exist in my body. It is hard to take the break I need in an academic culture where there is no expectation that rest might be needed. I find myself doubting my chosen career and wondering if I can make it through another conference. Kafer writes, “we are all to be smoothly running engines and disability renders us defective products” (2013, p. 54). When I can’t get out of bed for a week and deadlines pile up, I am made to feel like a defective product, a far cry from academia’s ideal worker.

This conflict between my body and the expectations of my profession can be explained through a consideration of crip time and chronopolitics. Conferences, libraries, and larger academic institutions have a hyper-focus on clock time, governed by chronopolitics. The conference clock time is an impossibility in the crip time chronically ill people exist in. Sleeping in and going to bed early to preserve energy forces us to miss sessions and opportunities. Waiting for the one accessible bathroom or the time it takes to navigate large conference halls makes us late to sessions. We are seen as failing when we cannot conform to the demands of conference time. And after the conference is over, the conflict between clock time and crip time only worsens. The penalties of conference attendance and forced conformity to clock time force us into bed and even further out of time. In the autoethnography of her stroke, Jane Speedy (2015) writes about the way time collapses when one is ill and in bed. For Speedy, illness calls into question one’s “situatedness” (p. 37). Time slows to a snail’s pace as we spend an hour staring at the wall or the inside of our eyelids. And time becomes fast, jumping forward when we sleep for hours and wake to find the sky already dark. Kafer (2013) writes that illness and disability impact how one experiences time: “Not only might they cause time to slow, or to be experienced in quick bursts, they can lead to feelings of asynchronicity or temporal dissonance” (p. 34). This disabled experience of time forces a departure from clock time or what Kafer calls straight time with its insistence on “firm delineation between past/present/future” (p. 34).

Crip connections

Carework is an idea coined by disabled writer and activist Leah Lakshmi Piepzna-Samarasinha (2018) to frame the work done by and within disabled communities of color as acts of love rather than chores or obligations. Distancing themself from traditional ideas of caring for disabled people as a burden, Piepzna-Samarasinha argues that caring for one another is a way to build power and foster communities where no one is left behind. We draw from the work of Black and brown queer femmes who center the importance of care and connection to use the idea of carework to offer new possibilities for both librarianship and academia, freed from neoliberal preoccupations with productivity that exclude disabled people. By making crip connections, we can create care networks that oppose typical productivity, embrace rest, and lead with care (Hersey, 2022). We offer care not because we feel obligated but because it is a radical way we can show our love for one another.

Rhys and Leah

While conferences have many consequences for chronically ill people, they also offer moments of connection. We the authors, Leah and Rhys, met at a conference in 2022, and the friendship that has blossomed offers us both support and solidarity to persist in an ableist and often hostile climate. Crip community makes withstanding the hurdles of conference attendance more possible as it offers togetherness and knowledge that one is not alone. Living across the country from one another, we only see each other in person when we travel to the same conference. While conferences are hard on our bodies, the opportunity for us to connect is a lifeline for us within academia. Upon our first meeting, we processed our grief and rage over a lack of COVID precautions over text.

Leah: The last thing I wanted to say is how this conference has COVID policy in place, but they could submit a negative test like two weeks ago? How does that help? Okay, thank you for letting me get this out lol.

Rhys: It’s so ridiculous. People want a checkbox that they’ve done the right thing but are unwilling to take steps to keep people safe. Which I know is the same with literally everything else in our society but it’s heartbreaking watching it unfold when it could be different.

Leah: Agreed. And when it’s something so small as a mask.

This conference was held in 2022. By 2024, almost all conferences have dropped all pretense of COVID precautions despite the reality that at the time of this writing, COVID is still a concern for us as disabled people: we don’t need to get another disability like Long COVID; this is not Pokémon, we do not need to catch them all. Our friendship allows us space to express our deep rage and feel less isolated in our experience. Malatino writes of an “infrapolitical ethics of care” which he calls “a reliance on a community of friends to protect and defend one from violence, to witness and mirror each other’s rage, in empathy, and to support one another during and after the breaking that accompanies rage” (2022, p. 118). We offer each other the refuge to express and mirror our anger.

Participating in conferences together also offers the validation of the experiences and the struggles that accompany attendance, which makes conferences a bit more bearable. It is difficult to explain to nondisabled people what it’s like to exist in our bodies and how impossible it can feel. Rhys was encouraged by a professor to attend a doctoral poster session held after a long day of sessions. Rhys made it out of their hotel room and to the event hall but was immediately overwhelmed by the mass of unmasked people, the noise, and the lack of chairs and retreated to bed. They texted Leah, “I was feeling too shitty to go to the posters, so I ate pizza in bed.” Leah validated this decision to prioritize their health and rest in bed rather than pushing through, which would lead to an even bigger collapse later on. As we parted ways at the end of the conference, we both expressed our joy at meeting one another and promised to stay in touch.

Leah: Please stay in touch, Always happy for rants, successes, and just general life being a disabled grad student. I’m really glad we connected.

Rhys: I am too! You’re my first disabled PhD person so I will definitely take you up on that.

In the two years that followed, Leah and Rhys have texted each other regularly for support and to celebrate accomplishments. And our friendship has evolved into a professional partnership with the writing of this and other autoethnographies examining our experiences. Our friendship offers support that is unavailable to us through traditional socialization within the university. Celebrating our successes together allows us to create an independent rewards system to sustain our energy in a culture where rewards are withheld and often portioned out in racist, sexist, heterosexist, and ableist ways (Museus & LePeau, 2020).

Within our friendship, we are also creating a small culture of care in academia and librarianship. These systems of power do not love us back; we all need and give care along with the time, support, and resources to care well (Segal, 2023). By caring for each other, and embracing crip time, we create pockets of space and time for care where none exists. We reject the neoliberalism of capitalism together by caring for and validating each other. And this is why we continue to do the work and go to conferences. When we go to conferences and bring carework with us, we are disrupting both the conference and academic systems by creating belonging where there was none. We also go to conferences because it is when we are able to see each other and maintain our relationship, which has proven to be an important lifeline in both our personal and professional lives. We also talk about things outside of our academic experience, expanding our two-person care network outward and further into the liminal crip time. Capitalistic systems are not built with care, and by creating systems of care, we further push against capitalistic expectations of librarianship and academia (Hersey, 2022; Segal, 2023; Kafai, 2021).

Conclusion and implications

Theoretical implications

This article is just one small conversation within a larger discussion of how to disentangle libraries and academia from their insistence on production and value of the ideal worker. While making conferences more accessible is a place to begin, until the pace of librarianship and academia changes, chronically ill folks will face the same ruptures of crip time and clock time that impede our success. In our writing, we turn to the future and what might be by using autoethnography, enabling us to attune to queer and crip futurity. Muñoz (2009) conceptualized queer futurities as “what’s not yet here”; these futures “insist on another time and place that is simultaneously not here yet but also to be glimpsed in our horizon” (p. 183). Inspired by Muñoz, we use our stories to reflect on what has been and ponder what could be. In this crip imagining, we speak into being the futures we want for ourselves. We manifest futures where rest is prioritized, where crip knowledge is valued and disabled people share with our well and nondisabled peers the lessons about wellbeing and community care. We aspire to a future where burnout is not glorified, where academics do not compete over who can work the most, where diverse work paces are considered for promotion. We imagine a future for library and information science where crip methodologies are esteemed as rigorous and given space in leading journals.


Acknowledgements

We would like to thank the Internal Peer reviewer – Brea McQueen; External Peer reviewer – JJ Pionke; and Editorial Board member Jess Schomberg for their kindness and help with this work. We would also like to honor our crip connection as authors for allowing this co-written scholarship and thank our bodyminds for producing this work.


References

Anderson, N. (2024). Chronically honest: An autoethnographic paper on the experiences of a disabled librarian. In the Library with the Lead Pipe. https://www.inthelibrarywiththeleadpipe.org/2024/chronically-honest/

Becker, T. (2019). Chronopolitics: Time of politics, politics of time, politicized time. History and Theory, 62(4), 3-23. https://www.hsozkult.de/event/id/event-89282

Beretz, E. M. (2003). Hidden disability and an academic career. Academe, 89(4), 50–55. https://doi.org/10.2307/40252496

Brady, F. (2023). Scaffolded information literacy curriculum: Slow librarianship as a rejection of the hegemony of neoliberalism. Journal of New Librarianship, 8(2), 29–40. https://doi.org/10.33011/newlibs/14/2

Clare, E. (2017). Brilliant imperfection: Grappling with cure. Duke University Press.

Ellis, C. (2004). The ethnographic I: A methodological novel about autoethnography. AltaMira Press.

Hamraie, A. (2017). Building access: Universal design and the politics of disability. University of Minnesota Press.

Hersey, T. (2022). Rest is resistance: A manifesto. Little, Brown.

Holman Jones, S., & Harris, A. M. (2019). Queering autoethnography. Routledge.

Isaacs, D. (2020). ‘I don’t have time for this’: Stuttering and the politics of university time. Scandinavian Journal of Disability Research, 22(1). https://doi.org/10.16993/sjdr.601

Kafai, S. (2021). Crip kinship: The disability justice & art activism of Sins Invalid. Arsenal Pulp Press.

Kafer, A. (2013). Feminist, Queer, Crip. Indiana University Press.

Kasnitz, D. (2020). The politics of disability performativity: An autoethnography. Current Anthropology, 61(S21). https://www.journals.uchicago.edu/doi/full/10.1086/705782

Kuppers, P. (2014). Crip time. Tikkun, 29(4). https://muse.jhu.edu/article/558118/pdf

Malatino, H. (2018). Tough breaks: Trans rage and the cultivation of resilience. Hypatia, 34(1), 121-140. https://transreads.org/tough-breaks-trans-rage-and-the-cultivation-of-resistance/

Manwiller, K. Q. (2021, May 26). The inaccessibility of ACRL 2021. ACRLog. https://acrlog.org/2021/05/26/the-inaccessibility-of-acrl-2021

Manwiller, K. Q. (2019, October 27). Conferencing while chronically ill. ACRLog. https://acrlog.org/2019/10/27/conferencing-while-chronically-ill

Muñoz, J. E. (2009). Cruising utopia: The then and there of queer futurity. NYU Press.

Museus, S. D., & LePeau, L. A. (2020). Navigating neoliberal organizational cultures: Implications for higher education leaders advancing social justice agendas. In A. J. Kezar & J. R. Posselt (Eds.), Higher education administration for social justice and equity: Critical perspectives for leadership. Routledge. https://works.bepress.com/samuel_museus/113/

Olsen, S. H., Cork, S., Anders, P., Padrón, R., Peterson, A., Strausser, A., & Jaeger, P. T. (2022). The disability tax and the accessibility tax. Including Disability, 1(51), 51-86. https://ojs.scholarsportal.info/ontariotechu/index.php/id/article/view/170

Piepzna-Samarasinha, L. L. (2018). Care work: Dreaming disability justice. Arsenal Pulp Press.

Pionke, J. (2023). The interview process and people with disabilities. Journal of Library Administration, 63(4), 587–593. https://doi.org/10.1080/01930826.2023.2201724

Pionke, J. J. (2019). The impact of disbelief: On being a library employee with a disability. Library Trends, 67(3), 423–435. https://doi.org/10.1353/lib.2019.0004

Price, M. (2011). Mad at school: Rhetorics of mental disability and academic life. University of Michigan Press.

Richards, R. (2008). Writing the othered self: Autoethnography and the problem of objectification in writing about illness and disability. Qualitative Health Research, 18(12), 1717-1728. https://doi.org/10.1177/1049732308325866

Samuels, E. (2017). Six ways of looking at crip time. Disability Studies Quarterly, 37(3). https://dsq-sds.org/index.php/dsq/article/view/5824/4684

Segal, L. (2023). Lean on me: A politics of radical care. Verso Books.

Speedy, J. (2015). Staring at the park: A poetic autoethnographic inquiry. Left Coast Press.

Vázquez, E., & Levin, J. (2018). The tyranny of neoliberalism in the American academic profession. Academe: Magazine of the American Association of University Professors. https://www.aaup.org/article/tyranny-neoliberalism-american-academic-profession

Zembylas, M. (2023). Time-as-affect in neoliberal academy: Theorizing chronopolitics as affective milieus in higher education. Studies in Higher Education, 49(3), 493–504. https://doi.org/10.1080/03075079.2023.2240352


[1] Other marginalized groups such as people of color and queer and trans people, and people in the Global South also have complicated relationships with clock time and often fail to operate on the schedule of normative time. By normative time, we mean a time system governed by Western, white, non-disabled, cisgender, and heterosexual society.

by Rhys Dreeszen Bowman at January 08, 2025 01:00 PM

Journal of Web Librarianship

Digital humanities in the library (2nd ed.)

by Bradford Lee Eden Independent Scholar and Academic Librarian, Philadelphia, Pennsylvania, USA at January 08, 2025 05:28 AM

Lucidworks

Beyond Basic Search: How to Clean Up Your Ecommerce Search in 2025

Discover how to transform your ecommerce search architecture for 2025. Learn proven strategies for implementing AI-powered search, real-time inventory awareness, and personalization from a search engineer.

The post Beyond Basic Search: How to Clean Up Your Ecommerce Search in 2025 appeared first on Lucidworks.

by Benjamin Holland at January 08, 2025 12:23 AM

January 07, 2025

HangingTogether

Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 7 January 2025

The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.

Silhouettes of flying birds against an orange sky. Photo by Barth Bailey on Unsplash

New IFLA guidelines for serving displaced persons

The International Federation of Library Associations and Institutions (IFLA) recently published the IFLA Guidelines for Libraries Supporting Displaced Persons: Refugees, Migrants, Immigrants, Asylum seekers to provide practical guidance for libraries supporting these groups. The guidelines define displaced persons as “persons who have been forced or obliged to flee or leave their homes or places of habitual residence (whether in their own country or across an international border), in particular as a result of or in order to avoid the effects of armed conflict, situations of generalised violence, and human rights violations or natural/human-made disasters. In the context of these guidelines, we refer as a whole to all these different groups: asylum seekers, immigrants, migrants, and refugees.” The guidelines cover services and programs for users, policies, staff training, and other topics. As these are broadly written for international use, libraries would use the guidelines as a starting point in the formation of their own policies and practices.

There is clearly a need for this type of publication, as there were over 117 million people forcibly displaced in 2023. There are many helpful recommendations, including offering a working space to humanitarian organizations inside the library and creating pop-up library spots inside refugee camps and asylum centers. I wish the authors had explicitly acknowledged the differences between voluntary migration and displacement, which is an involuntary migration caused by horrible conditions in the place people are leaving. While cultural and language differences may impact many immigrants and migrants, those who have been forcibly displaced are more likely to have additional disadvantages and special needs because of their displacement. Libraries are better able to serve displaced persons when staff understand the differences between these migration situations. I am guessing the authors of the publication chose to use the term “displaced persons” because it is not a legal term and many who do not legally qualify as refugees may have fled violence or extreme poverty and suffered terribly. I believe libraries will find this publication useful if they are mindful of these situational differences when reading the guidelines and reviewing resources cited in the bibliography. Contributed by Kate James.

Standing up for US libraries in a new era

All of us who treasure libraries and value the roles they play in a free and democratic society have been wondering how to prepare for the 119th United States Congress and the 47th President of the United States. On 15 January 2025 at 4:30 p.m. Eastern Time, the American Library Association Public Policy and Advocacy Office will offer “Standing Up for Libraries: The Next 100 Days,” a webinar that is free for all ALA members. Although attendance is limited to 1000, a recording of the session will be available to ALA members through 30 January. ALA promises to “offer tangible steps for library advocates moving forward and preview upcoming legislation and litigation that will impact the library field.”

Since 1945, the Public Policy and Advocacy Office has been the voice of libraries speaking to the government of the United States and keeping libraries and their advocates informed about government policies and actions. The office has been instrumental in furthering the interests of libraries and users in the realms of privacy, funding, copyright, government information, education, and related areas. Keeping informed and promoting library values remains as important as ever. Contributed by Jay Weitz.

Binghamton University Libraries honored for work in inclusion, diversity, equity and accessibility

Binghamton University Libraries was honored with the 2024 South Central Regional Library Council Prism Award, which honors library workers or organizations for work in advancing Diversity, Equity, Inclusion, Justice and Accessibility. This work includes implementing structural changes, actively becoming antiracist or reimagining policies to be inclusive. Binghamton Libraries have sustained several initiatives, ranging from fostering the library as a safe and inclusive space and place to diversifying collections.

Binghamton University was one of a number of OCLC Research Library Partnership institutions we interviewed in order to better understand how research libraries are approaching diversifying collections. It is great to see their work, which has been ongoing for some time, acknowledged in this way. Contributed by Merrilee Proffitt.

The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 7 January 2025 appeared first on Hanging Together.

by Merrilee Proffitt at January 07, 2025 10:35 PM

Open Knowledge Foundation

Project Respira: Open Innovation for Environmental Justice in Asunción

Air pollution is one of the world’s greatest environmental threats to public health. In Asunción, Paraguay, where pollution levels are increasing year after year, there is an urgent need for initiatives to tackle this problem. Project Respira, funded by the Mozilla Foundation as part of the Open-Source AI for Environmental Justice Award 2024, is presented as the first air quality prediction service for the city of Asunción and its surroundings. This initiative is an example of how free technology, open data and collaboration can provide innovative solutions to critical problems related to climate change and environmental justice.

What is Project Respira?

Project Respira is a pioneering initiative that aims to provide air quality forecasts for the Greater Asunción area. The project was designed to empower citizens by providing not only forecasts, but also health recommendations tailored to the local context, allowing them to make informed decisions to protect their health on days when air quality is compromised. From planning outdoor activities to avoiding exposure during periods of high pollution, this service represents a crucial step towards reducing the health risks associated with pollution through open information and citizen education.

The system includes a web platform and integrated bots in Telegram and X (Twitter) for easy access to the services. Features include daily air quality forecasts, guidance on how to protect your health based on pollution levels, statistics generated from sensors distributed across the city, and access to relevant research. In addition, the project’s open repositories on GitHub allow others to replicate and adapt the system in different contexts.

Open source forecasting in response to climate change impacts

Climate change has exacerbated environmental problems in Paraguay, creating increasingly urgent challenges. Pollution levels have increased in recent years, driven by forest fires, extreme weather events and urban sprawl. One example was in September 2024, when a thick layer of smoke covered the country for weeks as a result of fires in northern Paraguay, Bolivia and Brazil. This phenomenon caused respiratory problems for thousands of people and reignited the debate on the need for tools to anticipate and mitigate the risks associated with pollution.

Against this backdrop, Project Respira emerges as an effective and practical response, developed by the community and for the community. This system not only provides a predictive tool for air quality, but also reflects how citizen organisation and open technology can be used to address the challenges of climate change.

The initiative focuses on the principles of environmental justice, ensuring that benefits reach the most vulnerable. At the same time, it builds the capacity of communities and authorities to respond to environmental emergencies. Phenomena such as forest fires and air pollution, which are becoming increasingly common in South America, underscore the importance of projects like this to protect public health and build resilient communities in the face of climate challenges.

An initiative of civil society, academia and the open source community

Project Respira is the result of an important collaboration between civil society, academia and the open source community in Paraguay. The Faculty of Engineering of the National University of Asunción has been a key pillar, providing not only infrastructure and technical knowledge, but also access to open data essential for the operation of the system. A crucial aspect of this effort has been the active participation of the Paraguayan open source community, especially through the organisation Girls Code Paraguay, an initiative led by women in technology that has been instrumental in the development and implementation of innovative solutions. Their commitment and experience have made the project not only viable, but also replicable and scalable in different contexts.

The design of the system follows open technology principles, meaning that its components are adaptable and can be implemented by other regions, both within Paraguay and in other countries. Although in its early stages the project is focused on the Greater Asunción area, it is designed to provide regional forecasts in the near future, expanding its scope and benefiting more communities.

Open data, artificial intelligence and technological innovation in the context of the Global South

The success of Project Respira is largely due to access to open data and the use of advanced Machine Learning models. Predicting pollution levels requires not only pollutant measurements, but also accurate meteorological data to help model the dynamics of pollutant movement through the city. Pollution level measurements from the PM2.5 Particulate Matter Monitoring Network of the Faculty of Engineering of the National University of Asunción are combined with climate data from Meteostat, a global service that provides historical and current meteorological information.
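As a rough illustration of how pollutant measurements and weather observations might be combined into training data for a forecasting model, the sketch below pairs each hour's PM2.5 reading with the weather observed at the same hour and uses the PM2.5 value a fixed number of hours later as the prediction target. This is not Project Respira's actual pipeline: the function name, the dictionary layout, and the feature names (`temp_c`, `wind_kmh`) are hypothetical, and no real Meteostat or monitoring-network API is called.

```python
from datetime import datetime, timedelta


def build_training_rows(pm25_by_hour, weather_by_hour, horizon_hours=1):
    """Pair current PM2.5 and weather with the PM2.5 value `horizon_hours` later.

    pm25_by_hour: dict mapping datetime -> PM2.5 reading (hypothetical layout)
    weather_by_hour: dict mapping datetime -> dict of weather features
    Returns a list of (features, target) tuples; hours with missing data are skipped.
    """
    rows = []
    for ts in sorted(pm25_by_hour):
        target_ts = ts + timedelta(hours=horizon_hours)
        # Skip hours without a matching weather record or a future PM2.5 value.
        if ts not in weather_by_hour or target_ts not in pm25_by_hour:
            continue
        features = {"pm25_now": pm25_by_hour[ts], **weather_by_hour[ts]}
        rows.append((features, pm25_by_hour[target_ts]))
    return rows


# Illustrative values only, not real measurements.
t0 = datetime(2024, 9, 1, 12)
example_pm25 = {
    t0: 12.0,
    t0 + timedelta(hours=1): 15.0,
    t0 + timedelta(hours=2): 20.0,
}
example_weather = {
    t0: {"temp_c": 30.0, "wind_kmh": 5.0},
    t0 + timedelta(hours=2): {"temp_c": 28.0, "wind_kmh": 9.0},
}
example_rows = build_training_rows(example_pm25, example_weather)
```

Rows built this way could then be passed to any regression model; the choice of model and of which weather variables to include is left open here because the post does not specify them.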

A second innovative aspect of the project lies in the development of strategies to improve the quality of data from field instruments. A major challenge in collecting pollution and climate data in the Latin American context is the need for regular calibration of measuring instruments, which lose reliability over time due to wear and tear. This leads to operating and maintenance costs that are difficult to sustain over time.

Project Respira, in collaboration with researchers from the National University of Asunción, is using an innovative remote calibration system for its low-cost sensors, using data from the US Embassy’s AirNow system in Paraguay. Remote calibration is a solution that significantly improves the reliability of the data collected by the sensors. This solution allows low-cost sensors to be installed at numerous points around the city and remotely calibrated on a regular basis using a single high-end sensor as the standard, significantly reducing operation and maintenance costs. This is critical in contexts where resources are limited, such as in many Latin American cities. The approach is also replicable, allowing other regions to implement similar solutions tailored to their needs.
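The post does not describe the calibration method itself, so the following is only a minimal sketch of one common approach: fitting a straight line that maps a low-cost sensor's raw readings onto readings taken at the same times by a single reference sensor, then applying that line to new raw values. The linear model, the function names, and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean


def fit_linear_calibration(raw, reference):
    """Least-squares fit of reference ≈ slope * raw + intercept.

    raw: readings from the low-cost sensor
    reference: readings from the reference sensor at the same times
    """
    if len(raw) != len(reference) or len(raw) < 2:
        raise ValueError("need at least two paired readings")
    raw_mean = mean(raw)
    ref_mean = mean(reference)
    sxx = sum((x - raw_mean) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw readings must not all be identical")
    sxy = sum((x - raw_mean) * (y - ref_mean) for x, y in zip(raw, reference))
    slope = sxy / sxx
    intercept = ref_mean - slope * raw_mean
    return slope, intercept


def calibrate(raw_value, slope, intercept):
    """Apply the fitted correction to a new raw reading."""
    return slope * raw_value + intercept


# Illustrative paired readings, not real sensor data.
raw_readings = [10.0, 20.0, 30.0, 40.0]
reference_readings = [25.0, 45.0, 65.0, 85.0]
slope, intercept = fit_linear_calibration(raw_readings, reference_readings)
corrected = calibrate(50.0, slope, intercept)
```

Because the fit only needs paired readings, it can be repeated on a schedule without visiting each sensor, which is the cost advantage the post attributes to remote calibration.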

A model for the future

Although Project Respira is in its early stages and is currently limited to the greater Asunción area, its modular and open design allows for a future in which this service can be expanded throughout Paraguay and eventually replicated in other regions of the world. This approach demonstrates the potential of collaboration between academia, civil society and the open source community as an effective model for addressing complex environmental problems.

The project also highlights the importance of open data in the fight against climate change. Thanks to the availability of high-quality data from a variety of sources, it is possible to build predictive models that are not only effective, but also accessible and easily reproducible. This approach reinforces the need to promote open data policies that facilitate innovation and allow other similar initiatives to develop, adapted to the local specificities of each region.

In a world increasingly affected by climate change, initiatives like Project Respira play a critical role in protecting public health, empowering communities and building a more resilient future. By combining advanced technology, open data and interdisciplinary collaboration, this project sets a precedent for how environmental challenges can be tackled now and in the future.

by Fernanda Carles at January 07, 2025 04:53 PM

January 06, 2025

Digital Library Federation

NDSA Leadership 2024 Year in Review

Happy New Year! As we begin 2025, we wanted to take a moment to look back at what we’ve done over the past year. Please have a look at NDSA’s accomplishments – and feel free to reach out to NDSA with any questions on how you can get involved!

NDSA Leadership

NDSA Leadership has achieved significant milestones this year, with a focus on refining our mission and vision and growing NDSA into a more fully-fledged professional organization in response to members’ stated support needs.

In January 2024, NDSA Leadership released an RFI seeking expressions of interest from potential new host organizations. Over the course of the RFI process, we have been exploring alignment with other organizations, assessing their suitability as long-term hosts – and clarifying what NDSA can bring to the table for its host organization. To that end, in June 2024, we updated NDSA’s foundational strategy and outlined specific activities and initiatives to be completed in the next 3-5 years that will strengthen and stabilize NDSA’s shared governance, enhance membership services and outreach, and increase transparency through new communication strategies.

After the exploratory conversations kicked off through the RFI, in June we established a recurring monthly meeting with our prospective new host, with the goal of establishing a sustainable funding model and business plan. Together, we developed and conducted a member fee survey to assess members’ willingness and ability to sustain the organization through various types of financial contributions. Preliminary analysis of the survey data supports a fee-for-membership model as one element of a set of diverse funding sources, and underlines NDSA’s need for financial resilience. We hope to release a fuller report documenting the survey findings in the coming months.

Concurrently with the member fee survey, we also submitted a preliminary proposal for an IMLS planning grant in September. If awarded, the grant funding would allow NDSA to hire a part-time program coordinator who will assist with developing a business plan and transitioning NDSA to a fully self-funded model. We plan to begin drafting the full grant proposal soon, with anticipated milestones stretching into 2025.

Finally, NDSA Leadership is currently working on updated public governance documentation, integrating updates to bylaws and clarifying roles within the Coordinating Committee (CC). On the recommendation of the Membership Working Group, which suggested several improvements to the NDSA member experience, we have decided to add a CC Secretary role to oversee critical records, and we are establishing a Standing Membership Group that will focus on approving and onboarding new members – and supporting all NDSA members through enhanced outreach. Look for announcements about these efforts coming soon!

Membership Updates

NDSA welcomed eight new members in 2024, representing a diverse mix of academic, commercial, and nonprofit organizations:

For existing members, the new year is a good time to make sure your organization’s contact information is up to date. A simple form is available to assist with this process.

Stay tuned for a recap of the activities and accomplishments that our Interest Groups and Working Groups achieved in 2024!

The post NDSA Leadership 2024 Year in Review appeared first on DLF.

by Carol Kussmann at January 06, 2025 02:45 PM

Open Knowledge Foundation

‘Leveraging Open Data for the Benefit of Society’, a workshop with Open Knowledge Greece

The ‘Open Data Workshop’ took place on Thursday, December 12, 2024, at the new conference center of the Region of Central Macedonia. The event, organised by the Digital Governance Sub-region, focused on the values of open data in modern societies, aiming to promote transparency, innovation, and sustainable development.

The event featured panels and discussions on using public and open data to create digital tools that address contemporary challenges for the benefit of citizens. Charalampos Bratsas, President of OKFN Greece and Assistant Professor at the International Hellenic University (IHU), coordinated the panels, highlighting the importance of human oversight in data management and the need for enhanced transparency and interoperability in the digital transformation of both the public and private sectors. The workshop opened with greetings from representatives of the public sector, such as Nikos Tzollas, Deputy Regional Governor for Digital Governance. Afterwards, the invited speakers, among them Kostas Gioulekas, Deputy Minister of the Interior (sector of Macedonia and Thrace); Kostas Vassilopoulos, Deputy Mayor for Digital Policy and E-Governance; Athanasios Thanopoulos, President of ELSTAT; and Theophilos Mylonas, President of SETPE, discussed the values of open data in contemporary societies across several panels, including “Leveraging Open Data to Create Tools for the Benefit of Citizens”, “Open Data: Infrastructure – Challenges” and “Availability of Open Data”.

OKFN Greece also hosted a dedicated panel, titled “Workshop: Open Data Repositories”, where Lazaros Ioannidis, researcher at OKFN Greece and PhD candidate at IHU, presented the digital platform developed within the UPCAST Project to host an open data marketplace. Additionally, the launch of the ‘Open Up Thessaloniki Climate 2025’ competition was announced, inviting participants to register or contribute data to develop the competition’s open data platform.

Overall, the ‘Open Data Workshop’ was an important step towards raising awareness about open data and the challenges that arise from its implementation in society and the public sector. The contribution of OKFN Greece to the success of this event was crucial, setting the tone for future developments in the field of digital governance and the use of open data to address contemporary challenges.

by Open Knowledge Greece at January 06, 2025 01:47 PM

HangingTogether

Examining library structures to scale research support services: Insights from an OCLC RLP leadership roundtable

The following post is part of a series that documents findings from the RLP leadership roundtable discussions.

Research libraries are experiencing increasing demand for research support services, such as open research and data management, research analytics, and systematic reviews, often in collaboration with other campus partners. This presents significant challenges for effectively resourcing and scaling these services.


The OCLC Research Library Partnership (RLP) convened the Research Support Leadership Roundtable in October 2024 to discuss how libraries are making both incremental and large-scale changes to scale and resource their research support services.

The roundtable included 45 participants from 37 institutions in four countries, who engaged in four separate discussions focused on the evolving landscape of research support:

Binghamton University
British Library
Carnegie Mellon University
Clemson University
Colorado State University
George Washington University
Getty Research Institute
Hofstra University
Institute for Advanced Study
Monash University
Montana State University
Ohio State University
Rutgers University
Smithsonian Institution
Stony Brook University
Syracuse University
Temple University
Tufts University
University of Calgary
University of California, Riverside
University of California, San Diego
University of Delaware
University of Glasgow
University of Illinois Urbana-Champaign
University of Leeds
University of Minnesota
University of Nevada, Reno
University of Pittsburgh
University of Southern California
University of Sydney
University of Tennessee, Knoxville
University of Texas at Austin
University of Toronto
University of Utah
University of Warwick
Vanderbilt University
Virginia Tech

Our conversations focused on library organization and staffing, and participants were asked to consider these framing questions:

  1. Briefly describe how research support services are currently staffed and provisioned in your library. (For example, is support provided by subject liaisons? Librarians in functional roles? A combination of the two? Other?)
  2. If there are challenges with the configuration, describe these. Where do you most need to grow?
  3. If you are examining changing operational structures, please describe. Are there models you are considering emulating? Why?

This post offers a synthesis of our discussions. RLP leadership roundtables observe the Chatham House Rule; no specific comments are attributed to any individual or institution.

RLP libraries are innovating with library structures to scale research support

Resourcing research support services at an adequate level is a universal pain point among RLP libraries. What differs is how libraries have organized their staffing and services to meet these needs, and we heard from RLP libraries that are structured across a spectrum of organizational configurations.

At one end of that spectrum are libraries that rely primarily upon a decentralized cadre of subject liaisons to deliver research support. Liaisons provide a direct contact for users, roughly in parallel with the academic organization of the university. While most RLP institutions participating in the discussion rely on liaison librarians for some degree of research support, only a couple of institutions reported relying primarily upon liaisons to provide research support. And these libraries anticipate reconfiguration toward a mixed model as vacancies occur.

At the other end of the continuum are libraries that deploy a centralized functional model. Here, service-oriented teams manage library tasks—such as research data curation, copyright consulting, or collection development—across all disciplines, rather than assigning multiple responsibilities to individuals within a single subject area. Duane Wilson notes in a recent historical literature review that since 2011, more libraries have moved to this model, with librarians focusing on specialized functions, such as collection development, scholarly communication, and research impact.[1] Indeed, 6 of 36 university libraries in the US, UK, and Australia participating in the roundtable have shifted to a functional model.

Most libraries participating in the discussion, however, fall somewhere in the middle of the spectrum, with approximately two-thirds of institutions deploying a combination of these strategies. Library scholar Sheila Corrall has called this a “mixed structure,” with a combination of functional librarians supporting services like scholarly communications and data curation, and liaison librarians supporting one or more disciplinary areas.[2] The growth of specialized research support services (such as scholarly communication, data management, and research impact) has further driven this shift, creating increasingly mixed or matrixed approaches to service delivery.

Figure: Research libraries organize research support across a spectrum of organizational configurations, from decentralized subject liaisons to mixed structures to functional teams.

Roundtable discussions revealed that many RLP libraries are experimenting with organizational structures to increase capacity for research support. While a few institutions have eliminated legacy liaison roles altogether, most are being “reorganised around the edges instead of completely discarding their old structure and beginning anew.”[3]


Most libraries use a mixed structure of liaison and functional roles

Overall, roundtable participants expressed differing opinions about their continued use of a mixed organizational structure heavily reliant upon distributed subject liaisons. Many value how the liaison model supports collections-focused research and personalized support to faculty and students. But others expressed frustration with “historical positions” offering bespoke services that are neither scalable nor strategic.

One public US university library described its research support services as having developed in an ad hoc manner, resulting in a mixed structure. While this decentralized environment has encouraged innovation and experimentation, the participant noted it was “neither coordinated nor strategic.” Recognizing the unsustainability of this arrangement, the library is reorganizing, with the intention of better addressing under-supported areas like research services.

To scale within the existing liaison framework, several libraries are experimenting with team-based approaches. A private US institution, for example, organized its subject liaisons into functional teams like research impact and data services, but with mixed results. While each librarian has deepened their expertise by focusing on a specific functional service area, librarians struggled to do “double duty,” balancing functional responsibilities with subject expertise and networks—a challenge worsened by shrinking professional development budgets. Another public US institution tried a similar staffing configuration but found it unsuccessful, ultimately reorganizing liaison librarians into fully functional roles. Two other institutions are currently testing similar strategies.

Several participants expressed disappointment with their library’s inability to scale research support services using the mixed model but see near-term change as unlikely, due to a “lack of political will.” Another participant, however, sees this matrixed structure “mostly working” to scale research support, as subject liaisons support a “middle zone” of service in an area like research data management, reserving the most specialized work for the dedicated RDM librarians.

Benefits and challenges of a mixed organizational structure

Peppered throughout our roundtable discussions were many comments about the benefits and challenges of the mixed organizational structure. Often, things that many perceived as benefits simultaneously present challenges.

Benefits

  1. Highly personalized services. Liaison librarians provide customized support, cultivating strong relationships with faculty and students.
  2. Library visibility in academic units. Subject liaisons work closely with colleges and departments and are well-informed about faculty activities.  
  3. Experimentation and innovation. A decentralized environment, where librarians enjoy high autonomy, can foster innovative services and approaches.
  4. Familiarity with the service model. Distributed teams of subject liaisons have been the status quo at academic libraries for over fifty years, and both librarians and users have a high degree of comfort with this model.
  5. Organizational resilience. Distribution of knowledge across a broad team can mitigate disruptions when vacancies occur.

Challenges

  1. Lack of scalability. High touch, bespoke services cannot scale effectively and may result in duplicative efforts.  
  2. Uneven service quality. Many participants expressed frustration that service provision depends heavily upon individual librarians, leading to inconsistent experiences for patrons, an issue also frequently mentioned in the library literature.[4]
  3. Lower visibility by campus administrative units. While high visibility in colleges/faculties is a benefit of deploying liaison librarians, the distributed model can obscure library activities at the institutional level, reducing visibility with campus leaders and units.
  4. Un-strategic deployment of resources. At least a dozen individuals described their structures as an obstacle to strategic alignment with institutional priorities, hampering responses to institutional changes.
  5. An entrenched legacy model that is difficult to change. This is the flip side of patron and librarian comfort with the service model. There are high switching costs to move to another model.
  6. Matrixed work can be difficult to coordinate. Functional specialists are centralized and work across disciplines, whereas liaisons are distributed subject experts, so coordinating their activities is inherently complex. Poor communication across internal teams can create coordination gaps, leading to fragmented engagements and undermining the library’s organizational brand. To address this, one institution introduced cross-training to improve referrals, awareness, and workflows.
  7. “Double-duty” pressure. When subject librarians assume functional leadership roles, they may struggle to remain engaged, skilled, and networked in both subject and functional domains. While imperfect, this approach does offer a way to deepen research support expertise without radically restructuring the library.

A few libraries are shifting to a functional or service-oriented model

While most RLP libraries participating in this roundtable rely on a mixed structure, a few have transitioned to a functional or service-oriented model. These shifts, driven primarily by the need for greater scalability and strategic alignment with institutional priorities, can feel radical for both librarians and users.

One UK institution adopted a functional model several years ago, driven by the need to create capacity for open access support. The change has been successful, providing better support to users, and with the additional benefit of helping library staff and services “feel more embedded in the university.” The library plays a larger role on campus and is now a part of institutional strategy and planning conversations.

In the US, a public university also shifted to a functional model, reorganizing subject liaisons into two teams: student success (serving undergraduates) and research support and open scholarship (targeting faculty, researchers, and graduate students). The previous subject liaison model was seen as unsustainable.

After a long period of consultation, another university library is transitioning from a liaison model which delivered quality one-on-one service to researchers but lacked scalability and agility. Seeking greater research support capacity, the library redistributed education, engagement, and research responsibilities across three functional teams. The research services team will be further subdivided into research impact and publishing support.

Image: Lego blocks. Photo by Sen on Unsplash.

One participant described research libraries as being at a significant moment of change, as traditional liaison models (centered on collection development, information literacy, and reference support) become less effective as research support demands increase. Collection development work is also increasingly centralized.[5] At one large research university with more than 80,000 students and nearly 20,000 faculty and staff, the move to a functional model is intended to support more agile, scalable service delivery in response to institutional needs. The institution is implementing a tiered approach: ideally, 80% of support will be delivered via self-service access by users, followed by small-group workshops and, lastly, specialized high-touch support deployed strategically rather than as the default.

Most institutions that shifted to a functional model from a mixed structure described a fairly rapid transition, following extensive study and consultation. However, one public US institution made a gradual, decade-long transition from the liaison model, primarily by reallocating vacant liaison roles to more strategic functional roles in areas like research data management, scholarly communications, and teaching and learning.

Impacts on librarians

Library reorganizations have significant impacts on workers, and roundtable participants described a gamut of responses from librarians during their reorganizations. While some librarians thrive, developing new skills and expertise, others struggle, grieving the loss of professional identities and fulfilling responsibilities. Re-skilling is also a challenge, as increased needs for professional development, training, and conference attendance often collide with institutional austerity measures.

Relationships and organizational intelligence

A significant challenge reported by one public US institution was the loss of faculty relationships. Subject liaisons often attended departmental meetings and built deep connections. Structural changes can disrupt these relationships. Faculty members accustomed to contacting a familiar subject liaison may balk when asked to seek assistance through a general email address, and both users and librarians may quietly revert to using the former model.

Benefits and challenges of the functional model

Like the mixed organizational structure, the functional model has both benefits and challenges:

Benefits

  1. Scalability of service. Many participants see the functional, service-based approach as the only solution to the thorny problem of scaling support for large student and researcher populations. In general, these participants believe libraries must shift capacity away from time intensive personalized support.  
  2. Deepened research support. The functional model is not just about scaling services; it often also provides a deeper level of expertise to library users. Deployment of functional specialists in areas like copyright, open research, and data management can offer greater expertise to campus communities.
  3. Alignment with institutional priorities. Functional teams allow libraries to strategically allocate staff and resources, aligning more effectively with institutional goals. The decentralized provision by subject liaisons was frequently described by participants as “un-strategic” or “uncoordinated.”
  4. Equitable distribution of work. Work for functional teams is triaged more centrally, with the benefit of assigning work more evenly and equitably, reducing reliance on individual initiative. These workflows can also support analysis and reporting of library activities.
  5. Visibility and engagement with campus units. Organizing activities into functional units enhances the library’s visibility to institutional leaders and campus partners, whereas a distributed service model may seem complex and opaque. This approach can support cross-campus social interoperability and strengthen the library’s strategic role.
  6. Strategic campus partnerships. Closely related, functional teams can offer stronger frameworks for partnerships with other campus units. For example, one institution described how their functional research services team will focus on engagement and partnership through “central portfolios” with other campus units like the research office and graduate college. This partnership approach can help the library maintain alignment with institutional goals while educating stakeholders on the library’s evolving value proposition.  

Challenges

  1. Weakened relationships with faculty and units. Faculty appreciate the personal relationships and high touch offered by the liaison model, and they are often reluctant to change. A less distributed model can also diminish library knowledge of college/faculty activities.
  2. Human impact. The transition can be difficult for some workers who may feel a loss of professional identity and purpose, and some will struggle to thrive.  
  3. Retraining needs. Many librarians must gain expertise in new areas, but professional development and training can be hampered by budget constraints.
  4. Reduced resilience. One participant described the functional model as creating “a single point of failure,” when expertise resides in one employee.
  5. New workflows. Previous ways of working may not adequately manage and distribute tasks. One institution has implemented an internal tracking/ticketing system for managing requests coming to a central email address.
  6. Hiring challenges. Some participants said they found recruitment for functional roles challenging, as candidates with technical and functional skills are also in high demand across other industries offering more generous compensation.  

Final thoughts

Reflecting on these roundtable discussions, I see an urgent need for libraries to evolve in a complex, rapidly changing environment. Research libraries, particularly those affiliated with prestigious research universities, must develop services that align with the institution’s research and teaching missions. Institutional complexity often slows change, especially when stakeholders are invested in established structures, relationships, and workflows.

Many libraries continue to leverage legacy service models developed in an earlier era—when collection development, information literacy, and reference support were primary needs. These models predate online catalogs, WorldCat, the internet, digitized resources, linked data, e-books, and AI. In our leadership roundtable discussions, participants expressed a desire to explore new organizational structures. Yet, many acknowledged that near-term changes remain unlikely due to steep switching costs—the costs of shifting from one approach to another, such as new organizational structures, workflows, technologies, and relationships. Transitioning to new models demands effort, planning, political capital, change management, and patience.

However, as my colleague Brian Lavoie has written elsewhere, there are also status quo costs to consider. These arise from avoiding change and continuing existing practices. Roundtable participants delineated many status quo costs in our discussion, such as limited capacity for research support, reduced visibility among stakeholders, uneven service provision, and difficulty strategically deploying resources to support institutional priorities. Switching costs may be high, but status quo costs may be higher, with potential risks of diminishing the library’s value, autonomy, and access to resources.


There’s no silver bullet. No universal solution will work for every library. Tradeoffs are inevitable, and each library must consider its strategic priorities, resources, work climate, and overall business needs. My hope is that our roundtable discussions—and this synthesis—provide support to research libraries as they navigate change.


[1] Wilson, Duane. “Constant Change or Constantly the Same? A Historical Literature Review of the Subject Librarian Position.” College & Research Libraries 85, no. 7 (November 1, 2024): 1035. https://doi.org/10.5860/crl.85.7.1035.

[2] Corrall, Sheila. 2014. “Designing Libraries for Research Collaboration in the Network World: An Exploratory Study”. LIBER Quarterly: The Journal of the Association of European Research Libraries 24 (1): 17-48. https://doi.org/10.18352/lq.9525.

[3] Stueart, Robert D., and Barbara B. Moran. Library and Information Center Management. 7th ed. Westport, CT: Libraries Unlimited, 2007, 188.

[4] Wilson, 11.

[5] Wilson, 2.

The post Examining library structures to scale research support services: Insights from an OCLC RLP leadership roundtable appeared first on Hanging Together.

by Rebecca Bryant at January 06, 2025 01:00 PM

Open Knowledge Foundation

Event: Join us to discuss the role of tech in the so many elections of 2024

Mark your calendars! On 14 January, 2025, the Open Knowledge Foundation will host an engaging online event to dive deep into the intersection of technology and democracy. Together, we’ll reflect on the transformative Super Election Year 2024, exploring how technology shaped electoral processes worldwide and discussing the future we can build together.

🗓 Date: January 14th, 2025
🌍 Where: Online
🕑 Time: 14:00 UTC

Confirmed Speakers

💡 Stay Tuned: More details on speakers and sessions coming soon.

This event is part of The Tech We Want initiative—our ambitious effort to reimagine how technology is built and used. We believe software should be useful, simple, long-lasting, and, most importantly, focused on solving real-world problems.

Why This Matters

Elections are a cornerstone of democracy, but they are not immune to the challenges posed by rapid technological change. From digital voter registration systems to combating misinformation and ensuring secure, transparent electoral processes, the role of technology in 2024 elections was unprecedented.

This online gathering is a follow-up to our 2023 roundtable discussions on Digital Public Infrastructure for Electoral Processes, where experts from across the globe highlighted the need for inclusive, open, and reliable tech to support democratic practices. If you missed it, you can learn more about the initiative here.

What to Expect

The event will bring together policymakers, technologists, activists, and thought leaders to:

This is more than a conversation—it’s a call to action. Let’s ensure that the next generation of technology is built to empower citizens, uphold transparency, and strengthen democratic systems.

The discussions will directly contribute to a submission for the UN Special Rapporteur on Freedom of Expression’s 2025 Thematic Report on Freedom of Expression and Elections in the Digital Age. This is a unique opportunity to ensure that our collective vision for responsible and impactful technology influences global policies and strengthens the democratic process worldwide.

Join the Movement

Whether you’re a developer, advocate, or simply curious about the intersection of technology and democracy, this event is for you. Let’s come together and shape The Tech We Want—tech that works for everyone.

For updates, follow the Open Knowledge Foundation on Mastodon, Bluesky, X and LinkedIn.

We can’t wait to see you there! 🚀

by Open Knowledge Foundation at January 06, 2025 11:10 AM

Open Data Editor: How and why we are integrating AI into the app

In collaboration with Romina Colman

How can AI help non-technical users validate and improve the quality of their data in the Open Data Editor, taking into account transparency, privacy, and functionality?

In this blog post, we reflect on the collaboration, process and outcomes of integrating an AI feature into the Open Data Editor (ODE). We describe the challenges for which AI could provide a solution, our exploration of potential AI features, and the first feature we implemented to help users better understand their tables of data. We then reflect on this integration and outline the roadmap for additional AI features that would further improve ODE’s functionality and user experience.

Objectives and challenges

The current functionality of the Open Data Editor is aimed at providing “data validation and basic cleaning” capabilities to improve the quality of data in tables. In plain language, ODE checks for errors in tables according to specific rules. In ODE, these rules are defined by Frictionless Data, an OKFN initiative that provides standards and software implementations to improve data quality and interoperability.
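To make "checks for errors in tables according to specific rules" concrete, here is a minimal sketch of rule-based table checking using only the Python standard library. It is not the Frictionless implementation or ODE's code: the function name `check_table`, the three rules (blank header, duplicate header, blank or ragged row) and the message wording are all assumptions chosen for illustration.

```python
import csv
import io


def check_table(text):
    """Return a list of (row_number, problem) tuples for a CSV string.

    Illustrative rules only; real validators apply many more checks.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [(0, "empty table")]

    header = rows[0]
    problems = []

    # Rule 1 and 2: every column needs a non-blank, unique header.
    seen = set()
    for position, name in enumerate(header, start=1):
        label = name.strip()
        if not label:
            problems.append((1, f"blank header in column {position}"))
        elif label in seen:
            problems.append((1, f"duplicate header '{label}'"))
        seen.add(label)

    # Rule 3: data rows must be non-blank and match the header width.
    for number, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            problems.append((number, "blank row"))
        elif len(row) != len(header):
            problems.append(
                (number, f"expected {len(header)} cells, found {len(row)}")
            )
    return problems


sample = "id,name,name\n1,Ada,Lovelace\n\n2,Grace\n"
for row_number, problem in check_table(sample):
    print(f"row {row_number}: {problem}")
```

Each problem is reported with the row it occurs on, which is the kind of information a non-technical user needs to locate an error and decide how to repair it.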

ODE is a unique tool in that it offers these capabilities to non-technical data practitioners who typically analyse individual data files or data from public sources in a more ad hoc manner. Existing ‘data observability’ tools, such as Metaplane and Monte Carlo Data, are typically aimed at a technical audience to facilitate robust integration into large-scale data pipelines. Building a data preparation tool for non-technical users remains a major challenge and requires special attention to the interface, level of abstraction, and interactions. Writing the code is therefore only one side of the coin. A combination of soft and technical skills is needed to ensure that complex technical terms and feature implementations are understandable and transparent to users who will not necessarily investigate how something that looks simple, such as an AI button in the app, actually works.

The collaboration between the ODE team and me, Madelon Hulsebos, as an AI Consultant, was prompted by a desire to explore how Artificial Intelligence (AI) features could be used to enhance the core functionality of ODE and help users.

First pre-work meeting

As a first step, the team’s Product Owner, Romina Colman, and I met to discuss the status of the Open Data Editor and the shortcomings of related tools that the team had identified through a survey. We found that it can be difficult for users to understand how to use the tool and how to interpret its interactions (e.g. error messages). We also concluded that for non-technical users of ODE, it is important to provide transparency about what is happening, why and how, and to ensure the privacy of the user’s data.

The key question, therefore, is how AI can enhance the ODE’s ability to validate and improve data quality in a way that is transparent, privacy-preserving and trustworthy to a non-technical audience.

Exploring AI-driven features

The Open Data Editor team had initially identified 3 ideas:

  1. Improving key metadata elements
  2. Suggesting analysis questions
  3. Reporting table statistics

Based on this, it seemed that the first idea, semantic metadata refinement, such as “column descriptions” and “descriptive column names”, was at the core of ODE’s capabilities and that this feature would significantly help users to better understand their data.

I reviewed the ideas generated by the ODE team and enriched them with further suggestions that would improve the core functionality as well as some ideas for extending the functionality of the application. The ideas were described along the following list of dimensions:

I proposed three additional AI features that would assist users in using the ODE by 1) interpreting error messages, 2) answering questions about the ODE documentation, 3) summarising and contextualising the table metadata generated by ODE.

In addition to these features, I identified three other features that would extend ODE capabilities:

  1. Suggesting relevant data quality checks for the dataset at hand
  2. Suggesting ‘data repair’ based on error messages after running quality checks
  3. Suggesting a complete ‘data processing plan’ based on the dataset and the intended analysis

These features will proactively guide a non-technical user in validating and improving the quality of their data with ODE, extending its current capabilities.

Refinement and prioritisation

The team met to reflect on the ideas from different perspectives: backend, frontend, and product. The aim of this meeting was to come up with a list of priorities and an action plan. The ODE team prioritised four ideas based on functionality, in order:

  1. User-friendly error interpretation
  2. Answering questions about documentation
  3. Generating table statistics
  4. Improving table metadata (column names)

The team crystallised these feature ideas and I provided additional input based on open questions.

Based on insights from a few testing sessions with community members, the implementation effort versus the release timeline, and coordination with the Frictionless community, the team decided to start improving table metadata by suggesting improved column names and column content descriptions.

Development and experimentation

After identifying the AI feature that the team wanted to focus on first, they developed an initial implementation of the feature, taking into account the key values:

  1. Privacy of user data
  2. Functionality
  3. Transparency of the use of AI and references to OpenAI’s terms and conditions

You can review the complete issue created for the AI integration here: https://github.com/okfn/opendataeditor/issues/635

After providing instructions for the AI implementation, Romina tested it from the product side. Later, she had a meeting with me to ensure that the ODE team had not overlooked any relevant elements in the implementation. In fact, during this call, I noticed that the AI box, which asks the user to insert an OpenAI key, did not contain any references to explain how to obtain it. We added a link to the OpenAI documentation, and as OKFN has just launched a general course to help people work with open data, we asked the instructors to explain what a key is. We also added text and a link to allow users to check the terms and conditions of OpenAI. Finally, the link to this blog post will be added to the ODE’s user guide, so that people can also read more about the implementation process and decisions there.

The current pipeline for the AI feature is as follows:

  1. Given a particular table uploaded to ODE, the user clicks on the “AI” button.
  2. A dialogue informs the user that only the table header is sent to OpenAI.

[Note to readers: On 18 December, our team held the first group user test for the Open Data Editor stable release. One of the participants suggested changes to this message, such as including the name of the third party that ODE uses for AI integration (OpenAI), and some additional clarification regarding the steps that follow when the user clicks ‘Confirm’. We will be releasing a new version with these changes soon.]

  3. On proceeding, the user is asked for their OpenAI key.
  4. On proceeding, the user is shown the editable prompt that will be sent to OpenAI.
  5. On proceeding, ODE makes an LLM call with the key, the table header, and the current prompt, asking it to provide, for each column, 1) an improved column name and 2) a description.
  6. The user is shown the LLM-generated table description.
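
The privacy property in this pipeline comes from sending only the header row. A minimal Python sketch of that step is below. It is an illustration under assumptions, not ODE's code: the function names, the default prompt wording and the sample table are all hypothetical, and no request is actually sent to OpenAI.

```python
import csv
import io

# Hypothetical default prompt; the wording ODE actually uses is not given in the post.
DEFAULT_PROMPT = (
    "For each column name below, suggest an improved column name "
    "and a one-sentence description."
)


def read_header(csv_text: str) -> list[str]:
    """Return only the first row (the column names) of a CSV document."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, [])


def build_request(csv_text: str, prompt: str = DEFAULT_PROMPT) -> str:
    """Combine the prompt with the column names; no data rows are included."""
    columns = read_header(csv_text)
    listing = "\n".join(f"- {name}" for name in columns)
    return f"{prompt}\n\nColumns:\n{listing}"


# Made-up sample table: one header row and one data row.
sample = "cust_nm,dob,amt\nAda Lovelace,1815-12-10,42.50\n"
request_text = build_request(sample)
print(request_text)
```

Because the text is assembled from the header alone, the cell values in the sample never appear in what would be sent, which is the behaviour the confirmation dialogue describes to the user.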

Upon further review, we identified several revisions that were important for the first release of ODE with the integrated AI feature:

  1. The user experience for activating the feature, i.e. generating column-level metadata, should be more descriptive than “AI”, e.g. “describe this table”.
  2. The user should be told where and how to find their OpenAI key, the terms and conditions of OpenAI, and that the data will not be stored in ODE or shared externally.
  3. The prompt to the LLM should not be displayed and editable in advance to avoid confusion. Instead, it can be shown/edited after the initial output is generated for advanced use.
  4. The prompt should be preset and the LLM should be forced to adhere to the desired “structured output” of the metadata (e.g. provide a schema and output JSON). Requirements that cannot be enforced in this way can be built into the natural language prompt, e.g. that the output should be short, and that a particular language is required.
  5. Persist the generated output as metadata for future use or publication, and make the output useful. For example, give the user the option (via a button) to use the generated column names to replace the current ones.
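
Revisions 4 and 5 can be pictured as a validation step followed by an optional rename. The Python sketch below is a hedged illustration: the field names (`original_name`, `suggested_name`, `description`), the helper functions and the example reply are assumptions made for this example, not ODE's actual schema or a real model response.

```python
import json

# Hypothetical schema: one object per column with exactly these string fields.
REQUIRED_KEYS = {"original_name", "suggested_name", "description"}


def parse_suggestions(raw_reply: str, columns: list[str]) -> list[dict]:
    """Parse a JSON reply and reject anything that does not fit the schema."""
    data = json.loads(raw_reply)  # raises ValueError on malformed JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of column suggestions")
    for item in data:
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            raise ValueError(f"unexpected suggestion shape: {item!r}")
        if not all(isinstance(value, str) for value in item.values()):
            raise ValueError(f"all fields must be strings: {item!r}")
    if [item["original_name"] for item in data] != columns:
        raise ValueError("suggestions do not line up with the table's columns")
    return data


def apply_renames(columns: list[str], suggestions: list[dict]) -> list[str]:
    """Replace current column names with the suggested ones, keeping order."""
    mapping = {s["original_name"]: s["suggested_name"] for s in suggestions}
    return [mapping[name] for name in columns]


columns = ["cust_nm", "amt"]
# Hand-written stand-in for a model reply; not real OpenAI output.
reply = json.dumps([
    {"original_name": "cust_nm", "suggested_name": "customer_name",
     "description": "Full name of the customer."},
    {"original_name": "amt", "suggested_name": "amount",
     "description": "Transaction amount."},
])
suggestions = parse_suggestions(reply, columns)
print(apply_renames(columns, suggestions))
```

Checking the reply locally, in addition to asking the model for structured output, means a malformed answer is rejected before it can overwrite any column names.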

AI roadmap for the Open Data Editor

Integrating AI into the Open Data Editor can have significant value in providing a data quality validation and improvement tool that is accessible to non-technical users.

  1. Reuse the built-in AI feature to extend capabilities that fit into the same pipeline as described above, so taking as input the table or just its column names, making a single LLM call to generate as output, for example, data validation rules or data analysis questions.
  2. Link the error message from an executed data validation rule with the context of the ODE features and how to use them (e.g. from the user manual, code or documentation) to generate suggestions on how to “repair” the data.
  3. Question-answering through the documentation, so that users can ask any question and be directed to the right information in the documentation, for example using a retrieval-augmented generation approach such as that developed by the Scikit-learn team (see blog post). Given the effort this would require, it may be efficient to develop this pipeline together with other product teams in OKFN.
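
The retrieval step behind roadmap item 3 can be sketched in a few lines of Python. This is a toy illustration only: it ranks passages by bag-of-words cosine similarity, whereas a production pipeline would more likely use embeddings, and the documentation passages and function names here are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda passage: cosine(query, Counter(tokenize(passage))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Put the retrieved passage in front of the question for the LLM."""
    context = "\n".join(retrieve(question, passages))
    return f"Answer using only this documentation:\n{context}\n\nQuestion: {question}"


# Invented documentation snippets, standing in for the real user guide.
docs = [
    "To open a file, use the upload dialog and select a CSV or Excel file.",
    "Validation errors are listed in the errors panel below the table.",
    "Column names can be renamed from the metadata panel.",
]
question = "Where do I see validation errors?"
print(build_prompt(question, docs))
```

Only the best-matching passage is placed in the prompt, so the model answers from the documentation rather than from memory, and the user can be pointed to the passage that was used.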

Conclusion

The key question was how AI can strengthen the Open Data Editor’s ability to validate and improve data quality in a transparent, privacy-preserving and trustworthy way.

In this blog post, we reflected on the process and outcome of the AI feature, and outlined a roadmap for future integrations of AI functionality in ODE. The team successfully integrated its first AI feature: using an LLM to generate enhanced column names along with column descriptions, which helps users understand their data and improve metadata. The implementation of the feature minimises the amount of data actually passed to the LLM: only the table column names are provided, ensuring privacy. The user is actively informed of what is being shared with the LLM, ensuring transparency. When sending the table metadata to the LLM, the prompt is preset in ODE, while the LLM call restricts the generated metadata to be formatted in a structured way, ensuring trustworthy output.

Overall, the final AI feature strengthens the core of ODE by helping users better understand their data before anything is done with it, taking into account the key values of transparency, privacy and trustworthiness.

Read more

by Madelon Hulsebos at January 06, 2025 10:23 AM

January 05, 2025

David Rosenthal

Engineering For The Long Term

Content Warning: this post contains blatant self-promotion.

Contributions to engineering fields can only reasonably be assessed in hindsight, by looking at how they survived exposure to the real world over the long term. Four of my contributions to various systems have stood the test of time. Below the fold, I blow my own horn four times.

Four Decades

X11R1 on a Sun/1
Wikipedia has a pretty good history of the early days of the X Window System. In The X Window System At 40 I detailed my contributions to the early development of X. To my amazement, 40 years after Bob Scheifler's initial release, it is still valiantly resisting obsolescence. I contributed to the design, implementation, testing, release engineering and documentation of X11 starting a bit over 39 years ago. At least my design for how X handles keyboards is still the way it works.

All this while I was also working on a competitor, Sun's NeWS — which didn't survive the test of time.

Nearly Three-and-a-Half Decades

One of the things I really enjoyed about working on NeWS was that the PostScript environment it implemented was object-oriented, a legacy of PostScript's origins at Xerox PARC. Owen Densmore and I developed A User‐Interface Toolkit in Object‐Oriented PostScript that made developing NeWS applications very easy, provided you were comfortable with an object-oriented programming paradigm.

I think it was sometime in 1988 while working on the SunOS 4.0 kernel that I realized that the BSD Vnode interface was in a loose sense object-oriented. It defines the interface between the file system and the rest of the kernel. An instance of BSD's type vnode consisted of some instance data and a pointer to an "ops vector" that defined its class via an array of methods (function pointers). But it wasn't object-oriented enough to, for example, implement inheritance properly.

This flaw had led to some inelegancies as the interface had evolved through time, but what interested me more was the potential applications that would be unleashed if the interface could be made properly object-oriented. Instead of being implemented from scratch, file systems could be implemented by sub-classing other file systems. For example, a read-only file system such as a CD-ROM could be made writable by "stacking" a cache file system on top, as shown in Figure 11. I immediately saw the possibility of significant improvements in system administration that could flow from stacking file systems.
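
The stacking idea can be illustrated with a short Python sketch. It is a loose analogy under stated assumptions, not the Vnode interface or the paper's code: the class names are invented, files are plain dictionary entries, and the layers are composed by delegation through a shared `read`/`write` interface rather than by the kernel's ops vectors.

```python
class ReadOnlyLayer:
    """Stand-in for a read-only file system such as a CD-ROM."""

    def __init__(self, files: dict[str, bytes]):
        self._files = dict(files)

    def read(self, path: str) -> bytes:
        return self._files[path]  # KeyError if the file does not exist

    def write(self, path: str, data: bytes) -> None:
        raise PermissionError("read-only layer")


class MemoryLayer:
    """Stand-in for a writable local file system."""

    def __init__(self):
        self._files: dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self._files[path]  # KeyError if the file does not exist

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


class StackedLayer:
    """Writable view: writes land in the upper layer, reads fall through."""

    def __init__(self, upper, lower):
        self.upper = upper
        self.lower = lower

    def read(self, path: str) -> bytes:
        try:
            return self.upper.read(path)
        except KeyError:
            return self.lower.read(path)

    def write(self, path: str, data: bytes) -> None:
        self.upper.write(path, data)


cdrom = ReadOnlyLayer({"/etc/motd": b"welcome"})
root = StackedLayer(upper=MemoryLayer(), lower=cdrom)

print(root.read("/etc/motd"))   # served by the read-only layer
root.write("/etc/motd", b"edited")
print(root.read("/etc/motd"))   # now served by the writable layer
print(cdrom.read("/etc/motd"))  # the lower layer is unchanged
```

The stacked layer knows nothing about how either layer stores its data; it sees both only through the same two methods, which is what lets any writable layer be placed over any read-only one.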

Evolving the Vnode Interface: Fig. 11
I started building a prototype by performing major surgery on a copy of the code that would become SunOS 4.1. By late 1989 it worked well enough to demonstrate the potential of the idea, so I published 1990's Evolving the Vnode Interface. The paper describes a number of Vnode modules that can be stacked together to implement interesting functions. Among them was cache-fs, which layered a writable local file system above a local or remote read-only file system:
This simple module can use any file system as a file-level cache for any other (read-only) file system. It has no knowledge of the file systems it is using; it sees them only via their opaque vnodes. Figure 11 shows it using a local writable ufs file system to cache a remote read-only NFS file system, thereby reducing the load on the server. Another possible configuration would be to use a local writable ufs file system to cache a CD-ROM, obscuring the speed penalty of CD.
Over the next quarter-century the idea of stacking vnodes and the related idea of "union mounts" from Rob Pike and Plan 9 churned around until, in October 2014, Linus Torvalds added overlayfs to the 3.18 kernel. I covered the details of this history in 2015's It takes longer than it takes. In it I quoted from Valerie Aurora's excellent series of articles about the architectural and implementation difficulties involved in adding union mounts to the Linux kernel. I concurred with her statement that:
The consensus at the 2009 Linux file systems workshop was that stackable file systems are conceptually elegant, but difficult or impossible to implement in a maintainable manner with the current VFS structure. My own experience writing a stacked file system (an in-kernel chunkfs prototype) leads me to agree with these criticisms.
I wrote:
Note that my original paper was only incidentally about union mounts, it was a critique of the then-current VFS structure, and a suggestion that stackable vnodes might be a better way to go. It was such a seductive suggestion that it took nearly two decades to refute it!
Nevertheless, the example I used in Evolving the Vnode Interface of a use for stacking vnodes was what persisted. It took a while for the fact that overlayfs was an official part of the Linux kernel to percolate through the ecosystem, but after six years I was able to write Blatant Self-Promotion about the transformation it wrought on Linux's packaging and software distribution, inspired by Liam Proven's NixOS and the changing face of Linux operating systems. He writes about less radical ideas than NixOS:
So, instead of re-architecting the way distros are built, vendors are reimplementing similar functionality using simpler tools inherited from the server world: containers, squashfs filesystems inside single files, and, for distros that have them, copy-on-write filesystems to provide rollback functionality.

The goal is to build operating systems as robust as mobile OSes: periodically, the vendor ships a thoroughly tested and integrated image which end users can't change and don't need to. In normal use, the root filesystem is mounted read-only, and there's no package manager.
Since then this model has become universal. Distros ship as a bootable ISO image, which uses overlayfs to mount a writable temporary file system on top. This is precisely how my 1989 prototype was intended to ship SunOS 4.1. The technology has spread to individual applications with systems such as Snaps and Flatpak.

Three Decades

The opportunity we saw when we started Nvidia was that the PC was transitioning from the ISA bus to version 1 of the PCI bus. The ISA bus' bandwidth was completely inadequate for 3D games, but the PCI bus had considerably more. Whether it was enough was an open question. We clearly needed to make the best possible use of the limited bandwidth we could get.

Nvidia's first chip had three key innovations:
  1. Rendering objects with quadric patches not triangles. A realistic model using quadric patches needed perhaps a fifth of the data for an equivalent triangle model.
  2. I/O virtualization with applications using a write-mostly, object-oriented interface. Read operations are necessarily synchronous, whereas write operations are asynchronous. Thus the more writes per read across the bus, the better the utilization of the available bus bandwidth.
  3. A modular internal architecture based on an on-chip token-ring network. The goal was that each functional unit be simple enough to be designed and tested by a three-person team.
SEGA's Virtua Fighter on NV1
The first two of these enabled us to get Sega's arcade games running at full frame rate on a PC. Curtis Priem and I designed the second of these, and it is the one that has lasted:
The importance of the last of these was that it decoupled the hardware and software release schedules. Drivers could emulate classes that had yet to appear in hardware, the applications would use the hardware once it was available. Old software would run on newer hardware, it would just see some classes it didn't know how to use. One of our frustrations with Sun was the way software and hardware release schedules were inextricably interlinked.

Two-and-a-Half Decades

Last October I celebrated the LOCKSS Program Turns 25. Vicky Reich explained to me how libraries preserved paper academic journals and how their move to the Web was changing libraries' role from purchasing a copy of the journal to renting access to the publisher's copy, and I came up with the overall peer-to-peer architecture (and the acronym). With help from Mark Seiden I built the prototype, and after using it to demonstrate the feasibility of the concept, also used it to show vulnerabilities in the initial protocol. In 2003 I was part of the team that solved these problems, for which we were awarded the Best Paper award at the Symposium on Operating Systems Principles for Preserving peer replicas by rate-limited sampled voting.

by David. (noreply@blogger.com) at January 05, 2025 08:38 PM

January 03, 2025

LibraryThing (Thingology)

January 2025 Early Reviewers Batch Is Live!

Win free books from the January 2025 batch of Early Reviewer titles! We’ve got 161 books this month, and a grand total of 2,766 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.

If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check your mailing and email addresses and make sure they’re correct.

» Request books here!

The deadline to request a copy is Monday, January 27th at 6PM EST.

Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, the UK, Canada, Israel, Netherlands, Italy, Latvia, Lithuania, Luxembourg, Malta and more. Make sure to check the message on each book to see if it can be sent to your country.

Making the Best of What's Left: When We're Too Old to Get the Chairs Reupholstered
I See You've Called in Dead
The Delicate Beast
Out of the Way Things
Stranded
Renegade Grief: A Guide to the Wild Ride of Life after Loss
My Bonnie Lies Under
Sacramento Noir
Deaf Heaven
Madame Sorel's Lodger
The Vigil
The Afterdark
It's Getting Hot in Here
We Are All Animals: Discover What You Have in Common with a Cat, a Bat, a Jellyfish, and 150 Other Animals!
Rodeo
I Have Not Considered Consequences: Short Stories
Welcome to the Honey B&B
Night Hawks
Break My Fall
Loving Lily and the Magic Seed
This Thing Is Starving
Magic in Her Blood
The Dream Is the Truth
From Apollo to Artemis: Stories from My 50 Years with NASA
Aeros & Heroes
Safe Church: How to Guard Against Sexism and Abuse in Christian Communities
A History of Hazardous Objects
Dream City
Monument Eternal
Interficial ARTelligence: The Moments That Met Me
Drag Racing in The 1970s
Cecile's Secret
The Four Queens of Crime
Serial Killer Support Group
Be a Blessing: Jewish Women on Celebrating Life
Jewish Values in the Torah Portion
My Israeli Journey: A Memoir
Law of the Letter
Israel's War of Self-Defense
The Magic in the Tragic: Inspirational Stories From the Hamas War against Israel
We Are Black Jews: Ethiopian Jewry and the Journey to Equality in Israel
Devil on My Trail
The Blue Door
Untitled Goose Game
Wretched
Case Closed: Ian Bailey and the Murder of Sophie Toscan du Plantier
Daughter of Gold
Making Time: A New Vision for Crafting a Life Beyond Productivity
Duch
Eavesdropper
The Daughter of Rome
A Constant Love
Green Pastures
Shattered Sanctuary
Kate Landry Has a Plan
Big Money: Who Is In Control?
Prophecy
Fantastic Lou: Little Comics from Real Life
Their Cruel Lives
Kinetics
What Was Lost
The Third Temple
Nine-Year Cycle: A Memoir
I See You've Called in Dead
A Happy Beginning
Omniviolence
The Fairy Godmother's Tale
Legend of the Narwhals
Unwanted: Abandoned But Never Broken
Doggie Haiku: A Novella in Haiku for Dog-Lovers
Moonroads: Poems
International Business Essentials You Always Wanted to Know
Consumer Behavior Essentials You Always Wanted to Know
End of Earth: A Collaboration of Poetry and Painting
Poems Momma Never Read Me
Words Into Elephants: Tiny Poems
Careful What You Hear
The Wretched and Undone
Il mio diario digital detox
Artificial Agent
Stormflower
The Northern Pacific Railroad
Love, Lies and Takebacks
Behind the Ghost Metropolis
A Cleansing Flame
Of Drought and Fire: Two Natural Disasters in Australia
Moral Machines: Instilling Ethics into Artificial Minds
Demon Circle
Barbara Ann Scott: Queen of the Ice
Fuel for Thought: Ideas to Revolutionize Our Lives, Improve Civilization, and Build a Better World
Love's Journey
Thoughtful Aging: Restoring Honor to the Aging Process
Super Easy Mediterranean Diet Cookbook for Beginners: 2000 Days of Quick, Easy, HEALTHY AND Delicious Cooking Recipes for Busy Cooks, With Full Coloured Pictures, 30-Days MEAL Plan Included
The Viruses Enigma
Joker Joker Deuce
Beyond the Divide: Finding Common Ground in a World of Differences
The Gift
Trekking in Shangri-la
The Afterlife Project
A Memory of Fictions (or) Just Tiddy-Boom
The Thing in Christmas Town
Sapphire Stone: Adventures of Pirate Captain Skye
The Awakening
Safe in Death
Autism Asha: A Parent's Guide to Curing Autism
Parrot Prose
Gabe's Casts Of Courage: A Tale Of Bravery During Diagnosis
Love, Lies, Lunar New Year
Benji and Briana Become Booger Doctors
Golda's Hutch
Boy With Wings
GEID-Star, The Empathic Machine
Satyr Plays 2
Lilith
Dustria
Just Make Something: A Love Letter to Creatives
Money Talks: Everything Your Parents Should Have Taught You about Saving and Spending (but Didn't)
Short Stories
The Unsung Heroes
Until It Was Gone
Love, Lies, and Local News
Why Should You Care?: Real Life Encouragement from Real Life Experiences
When I Became Never
Second Coming
Eva
Dragonfire - The Rage of the Dragon
A Madness Unmade
Dark Shadows Hover
Welcome to Cemetery
Flume
Mechanics of Poetry: A Fresh Approach to the Subject
Traces
The Man Who Put On a Dress and Could See Through Walls
Even If: Keeping Faith in the Face of Adversity
The Tale of Iśva Raman
The Shattered Truce
The Passion Paradox: A Guide for Introverts to Thrive, Connect, and Succeed in a Noisy World
The Duty of Memory: Le Devoir de Memoire
In the Darkness of Shards: Poems from a Broken Place
Maya and Waggers: Mega Gossip
Wishtone: Maladies That Bind
Somewhere Past the End
Magic Compendium
Huntress: Embers of Redemption
Sabotage at Potion Pines
Crimes After Hours
Ping's Mystery in Pixiandria
Drawing Freedom
Miracle Boy
Beyond the Surface
Close to Home: Exploring the Near Rather than the Far
Brewing Love
Night Lore
The Germans Have a Word for It
One in Vermilion May Live
Warrior's Grace
Catians: Volume 1
The Heart of a Monster
How the HeussKid Moved the Mole-Lid!
The ABCs of BIG-Impact Homemaking: Designing a Life of Balance and Integrity That Glorifies God
The Anyones: Part I

Thanks to all the publishers participating this month!

Akashic Books
Artemesia Publishing
Autumn House Press
Baker Books
Bellevue Literary Press
Bethany House
Boss Fight Books
CarTech Books
Cinnabar Moth Publishing LLC
City Owl Press
CMU Press
Crooked Lane Books
Gefen Publishing House
HTF Publishing
Inlandia Institute / Inlandia Books
Kinkajou Press
Legacy Books Press
NeoParadoxa
New Door Books
Prolific Pulse Press LLC
PublishNation
Restless Books
Revell
Riverdale Avenue Books
Rootstock Publishing
Running Wild Press, LLC
Simon & Schuster
Somewhat Grumpy Press
Tundra Books
Type Eighteen Books
University of Nevada Press
Unsolicited Press
Vesuvian Books
Vibrant Publishers
What on Earth!
Wise Media Group
Yorkshire Publishing
Zibby Books

by Abigail Adams at January 03, 2025 07:59 PM

January 02, 2025

Peter Murray

Issue 101: Data Centers

One of the very first issues of Thursday Threads was on data centers (2011). That issue had articles on a major Amazon Web Services outage, remote data centers powered by renewable energy, and videos about Google's and Meta's data centers. Unfortunately, I've found that the videos are lost to time. It is interesting that the concerns about data centers live on. This post continues that thread with these topics:

Also recently on DLTJ:

Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.

AI-driven Data Center Development's Impact on Local Grid Users

U.S. map highlights cities with over 8% total harmonic distortion and significant data center activity in megawatts, 2024 data.
Map from Bloomberg shows local average of sensors' worst total harmonic distortion readings from February to October; areas with an average of 8% or more are deemed as exceeding accepted industry limits.

AI data centers are multiplying across the US and sucking up huge amounts of power. New evidence shows they may also be distorting the normal flow of electricity for millions of Americans. This map shows readings from about 770,000 home sensors, with red zones indicating areas with the most distorted power. The problem is threatening billions in damage to home appliances and aging power equipment, especially in areas like Chicago and "data center alley" in Northern Virginia, where distorted power readings are above recommended levels. An exclusive Bloomberg analysis shows that more than three-quarters of highly-distorted power readings across the country are within 50 miles of significant data center activity. While many facilities are popping up near major US cities and adding stress to already fragile grids, this trend holds true in rural areas as well.
AI Power Needs Threaten Billions in Damages for US Households, Bloomberg, 27-Dec-2024

There has been much written about how data center development is moving into areas where it can soak up cheap excess electricity. This is the first I've heard about how data center power draws can distort or even harm the grid for existing customers. Set aside how the article is framed as another way that the creation of AI-driven products is harmful; data centers are going to be built no matter what the purpose. As the nation's power grid is restructured to incorporate more renewable sources and power storage mechanisms, this is yet another factor that will make that transition more challenging.

It isn't just power; water use — primarily for cooling — is also a concern: "Artificial intelligence technology behind ChatGPT was built in Iowa — with a lot of water".

Cryptocurrency Mining Rigs are not a Welcome Addition to this Texas Town

Corsicana, the seat of Navarro County, is best known for kicking off the Texas oil boom in 1894, when a 1,000-foot well meant to alleviate a water shortage instead turned up an oil field that extended for miles. In the century to follow, tens of millions of barrels of oil were pulled from the city—and Corsicana got rich... The oil fields are drying up. In Riot's high temple of cryptographic computation, local officials think they've found a stopgap. Some Corsicana residents aren't so sure. They see the facility as a blot on the landscape that threatens their property values, vulnerable energy grid, and quiet rural lifestyle. And they're fighting back.
The World’s Biggest Bitcoin Mine Is Rattling This Texas Oil Town, Wired, 11-Sep-2024

The article highlights the tension between data center development and community concerns, drawing parallels between the historical oil industry and the current rise of bitcoin mining in the region. Despite claims from the developer that their operations will stabilize the grid, critics argue that cryptocurrency mining is a drain on resources and exacerbates noise pollution. Residents want local governments to address the issues where noise has led to health issues and disrupted lives.

It isn't just rural areas, either; suburban Virginia and downtown Chicago are also affected.

Data Center Development Causes Resurgence in Nuclear Power Interest

On Tuesday, Google announced that it had made a power purchase agreement for electricity generated by a small modular nuclear reactor design that hasn't even received regulatory approval yet. Today, it's Amazon's turn. The company's Amazon Web Services (AWS) group has announced three different investments, including one targeting a different startup that has its own design for small, modular nuclear reactors—one that has not yet received regulatory approval.
Amazon joins Google in investing in small modular nuclear power, Ars Technica, 16-Oct-2024

Amazon Web Services has joined other tech giants like Google in investing in small modular nuclear power, reflecting a growing interest in nuclear energy among major companies. The interest in small modular reactors stems from increasing energy demands, particularly from data centers, and the challenges of relying solely on renewable energy sources. While renewables are cost-effective, their intermittent nature and grid connection issues limit their viability for continuous power needs. The lengthy and costly construction timelines of large reactors further complicate the situation, making small modular reactors a more appealing option despite their unproven status.

Don't count out Microsoft...it wants to restart and refurbish Three Mile Island reactors as part of its data center energy plans. Key permits are still needed before this is fully in place, though.

Africa Sees Jump in Data Center Construction

A new generation of data centers are being built in Africa's smaller economies, fueling a $5 billion market opportunity on the world's fastest growing continent.... The emergence of local data centers has powered a surge in cloud-based computing in five of Africa's largest economies — South Africa, Nigeria, Kenya, Egypt and Morocco — in the last five years. Now the number of data centers built in smaller African economies is surging as well, with up to $700 million of capital investment pouring in each year in the past two years, according to data from research firm Xalam Analytics which monitors the industry.
Data centers fuel cloud computing in smaller African countries, Semafor, 22-Jun-2023

The rise of data centers in Africa will hopefully solve economic disparities and enhance digital sovereignty on the continent. For instance, improved proximity to data centers will lower transit costs for internet service providers, potentially boosting online economic activity. (Historically, African data has been stored internationally, leading to slower connections and complicating compliance with local privacy laws.) Connectivity — especially below the equator — remains an issue, though; "Google and Meta’s underwater cables up the stakes on internet control"

Although some African countries welcome the new data centers, others are concerned. Chile, for instance, has "multiple groups working to keep Amazon, Google, and Microsoft from doubling the number of centers in the country, fearing environmental devastation". And a Norwegian ammunition manufacturer blames ‘storage of cat videos’ for threatening its growth.

Speaking of cats...

Mittens' Food Box

We have a temporary third cat, Pickle, in our home that is very food driven. It will bully the other cats away from their food. (Pickle is also known for stealing and eating whole chocolate chip muffins from the breakfast table, too.) So we added one of those microchip-enabled pet doors to this plastic tote so our first cat, Mittens, can eat in peace.

by Peter Murray at January 02, 2025 05:00 AM

Coral Sheldon-Hess

2024 → 2025

I entered 2024 on the brink of burnout. Maybe a bit past the brink. I didn’t feel secure in my job; I was still exhausted from a rushed move at the start of the year; I ignored the fact that I was already busy and tired and went for my Maine Master Gardener Volunteer certification anyway; and, you know, there’s that whole pandemic thing where I was (still am) living the 2020 lifestyle and feeling very alone in that.

2023 background (skip me)

I want to explain the job insecurity, but if you don’t care about that (and you shouldn’t, it’s a downer), please feel free to skip this paragraph! By the start of 2024, I knew my job really well, inasmuch as it was knowable (the boundaries of my role were pretty squishy). I’d done some worthwhile things, impressed some folks, built some strong relationships, and learned how to get information in an organization where news was rarely shared via official channels. I was doing good work that I was proud of and that my stakeholders were happy with. But my position felt precarious because, even though I was hired to work remotely and will probably always need to stay remote, 1) my manager had tried to pull me into in-person work and insisted I should apply for a formal accommodation with HR if I wanted to avoid on-campus work or to be kept safe with any mitigations beyond my own mask during the six campus visits she could require I make per year. I ended up with a written agreement that I was officially 100 percent remote (no required visits at all) for now, but aside from being allowed to mask, I would be unprotected if I ever visited campus (meaning that my manager could require long meetings in small, unventilated, overcrowded rooms with no breaks, and I was not allowed to ask others to mask); also, I had to reapply yearly, with a new doctor’s note each time. The process was dehumanizing and felt like it would inevitably fall through because 2) the associate director of HR who managed my case was visibly unenthusiastic about remote work/employees and also responded mockingly to all of my doctor’s requests for mitigations to improve safety. It’s worth saying: I had the full support of the library’s new director and every coworker I ever spoke to besides the HR rep and my manager (and after our director stated his support, my manager’s opinion also seemed to change). 
I genuinely believe my director would have done his best for me; but ultimately, I couldn’t know how much power he would have had to overrule HR.

Job news

Given my feelings of precarity and some other issues that arose while I was in that role, I feel incredibly fortunate to have, after a short search, found a new position that feels like a great fit! I was able to give 3+ weeks of notice, staying through the first week of fall semester at my old job, which left them in a good spot for the school year. I took three weeks for myself, and at the beginning of October, I started my position at Colorado State University Libraries as a Developer & Systems Administrator. (Which means I get to build my sysadmin skills! I’d happily accept advice on how to do that!)

I know it’s early to say this with any certainty, but so far, I feel like CSU Libraries have lived up to all of my hopes. Lots of people have hybrid schedules, and two others are fully remote, so meetings are generally online for maximal inclusion. My manager knows about and is cool with my health/disability situation; she also does a great job running a remote/hybrid technology team, including regular informal chats; the two other people with the same title I have(!!) are so smart and skillful but also extremely kind and patient; the whole team is fantastic and helpful and super sharp; the colleagues I’ve met from across the Libraries are delightful; our dean has a technology background (so she understands the value of our work) and is thoughtful about power structures; and there are not only multiple vocal advocates for accessibility, but the library’s leadership puts real time and resources into DEIA efforts. On the technology side, we run our own servers and a lot of infrastructure I haven’t been able to touch in previous positions, which is a cool new challenge. It’s great. So many things I’ve heard other libraries say “can’t be done” are being done, here, and I’m glad to be a part of it! Now I just want to get up to speed as quickly as I can so I’m a productive part of it.

As a bonus, my hours are generally 10:30am-7pm (my choice, they’d have worked with me if I wanted to be more on Eastern time), which fits so much better with my natural circadian rhythm than east coast work ever did. They also believe in flexibility for their workers, so I have room to shift my schedule as needed on individual days (say, to sleep through an afternoon migraine and make up the time in the evening), or potentially to negotiate different starting and ending times on different weekdays if needed. Of course there are scheduled meetings and goals/expectations and all that — it’s a job — but I genuinely haven’t caught any whiffs of the presenteeism you see in so many academic and business institutions.

So, I mean, yeah, I’m probably still burnt out. That doesn’t go away quickly. But I think I’m in a good, sustainable situation, job-wise, which is going to help.

Home and garden news

It also doesn’t hurt that I’ve made it through my Maine Master Gardener Volunteer (MGV) traineeship. Or, well, mostly. I finished all ~40 hours of lessons (October 2023 – March 2024); I’m at 38.25 hours of the 40 I was meant to have volunteered in 2024; and my coordinator says they won’t throw me out for not hitting 40 on the dot. (Most MGVs are retired. Not a lot of us try to do it on top of full-time jobs. And I think I’m the only one who does it entirely virtually, so I’m not counting any travel time or anything.) Next year, and every year thereafter that I want to maintain my MGV certification, I only need 20 hours — which still sounds exhausting to me, right this second, but is certainly more achievable than 40 + lessons.

In all that time volunteering and changing jobs and everything, I completely neglected my own yard and garden. The work we paid for in 2023 ended up being pretty bad, alas, and we haven’t really had the heart (or the energy) to clean it all up, beyond filling in the hole after our pond liner floated out of it last winter. (Seriously, the company we hired? Do not recommend, at least not for ponds or lawns, or really trees. The whole thing was a mess, start to finish.) We’ve made a little progress, and I hope to make more—get those elevated beds up so I can grow things—but we’ll see.

In sad news, we said goodbye to our 17-year-old chinchilla, Princess Eleanor Rubidium Chinchillington, III (a.k.a. Ella). She had a genetic condition where her teeth grew in both directions, not just up from her jaw, but down into and through it; it’s not something they can do surgery for, so it’s always eventually fatal. We kept her as comfortable as we could, with pain meds and squishy food twice a day, and in her last months of life she enjoyed total run of Dale’s room (since she had stopped chewing on things, she didn’t have to be supervised to be out) and several times broke containment and enjoyed the life of a small, fuzzy criminal, running rampant through the whole house. She passed quietly, while being cradled in a blanket by her favorite person in the world.

Our other pets—Phoebe, Pumpkin, Hermann, and Newton—are mostly doing well. Pumpkin was named Chubby Bird of the Day (Facebook link) back in April. Phoebe is a very old man who needs pain meds for his foot each day and sleeps more than he’s awake, but he toddles around to where he wants to go and makes happy beak noises every day. Pumpkin checks on him first thing every morning, which is adorable, and (sometimes) yells at us when he knows Phoebe needs us for something. (He also yells for no reason, so. Maybe we’re projecting intention, here.) And the budgies annoy them both whenever they’re all out.

Other notable things that happened this year: watching a rescued baby seal being released back into the ocean; traveling just a couple of hours away to see the eclipse in totality; speaking at Solstice School; experiencing the best aurora borealis of either of our lives, ever, including when we lived in Alaska (that’s the hero image for this post); and making an excellent crocheted “Medusa” hat with cartoony snakes for Halloween. By several measures, it was a good year.

2025 plans

Learning

I’ve signed up for online American Sign Language classes. (The organization is called “Queer ASL,” and the courses center 2SLGBTQIA+ people and experiences, but allies are welcome and invited to take the courses too!) Just as the herbal classes I took online in 2024 (the Zoom sessions of Wild Cherries Year 2: Racemes of Delight; they’re in the Pittsburgh area, so I couldn’t join anything in-person) didn’t add to my feelings of exhaustion, I’m finding that studying ASL is also energizing rather than draining. It uses a different part of my brain than work or chores, and (as I might talk about in another post coming up) it feels like one of the “little good things” I can do that might make the world a better and more inclusive place in the coming years.

I’m doing a self-paced training on herbs for chronic illness with an herbalist I respect, but because it’s entirely self-paced, I’m doing it painfully slowly. (There’s a pun there, probably.) I’d better finish that in 2025; I think I only have access for a year?

And in terms of work-related trainings, CSU paid for access to Practical Accessibility for each Dev/Sysadmin (and for our new User Experience Professional!), and I made good progress on it during my first three months on the job. I’ll finish that in 2024, and then I’ll start looking for Linux / Systems Administration / Cybersecurity training.

Community

A thing I love about the Stonefruit and Wild Cherries community in Pittsburgh—and probably part of the reason those classes were more soothing than tiring—is that the people involved are so caring for and careful with each other. In addition to a number of other caring behaviors, they kept each other safe by holding their in-person classes outdoors, requiring a negative COVID-19 test before arrival, and having everyone mask while indoors. I would love to find such a caring community up here in Maine, enough so that the temptation to start one keeps growing. (Hear me out: there’s the Feminist Bird Club, which has no chapters here, and also Birdability, which has no captains I’m aware of here; a hybrid of the two would be a great addition, right? Our local Audubon chapter already offers accessible birding events, so I bet they’d be supportive.)

If I don’t find (or create) any kind of formal community, I will at least spend more time hosting friends, I hope. We lit our fire pit on December 21, and a couple of friends stopped by. It was much too cold to be outside, but the company was great! We’ll hopefully do more of that, during more pleasant weather.

I also plan to be more intentional with my time, cutting down on social media and replacing the “ambient news” part of it with RSS feeds and the socialization part by scheduling time to talk with people, or emailing them, or writing letters. I found an RSS reader that will let me follow the Facebook pages my town uses for official news (I wish I were joking), so that’s one problem solved. The hope is that I’ll gain back some time for reading, gardening, and other things that benefit my mind and spirit more than scrolling — or if I don’t gain back time, because I’ve scheduled so much for socializing, then at least the connections I have with others will be more meaningful.

Etc.

I won’t belabor this point, but I am expecting a lot of turbulence in the coming years. I’m proud of us for getting through 2024, and I hope we will each do our part to get as many of us through 2025 as possible. Keeping mind, body, and spirit together can be a lot, and when we manage it, it’s something to celebrate.

My motto going into 2025 is the same as it’s been since 2017 or so: brace for tough times, but don’t let go of hope. Make the world better in whatever small ways you can.

by Coral Sheldon-Hess at January 02, 2025 02:14 AM

January 01, 2025

Digital Library Federation

DLF Digest: January 2025

A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here

 

Happy New Year! We hope everyone is staying healthy throughout the winter. 

— Team DLF

This month’s news:

This month’s open DLF group meetings:

For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.

DLF groups are open to ALL, regardless of whether or not you’re affiliated with a DLF member organization. Learn more about our working groups on our website. Interested in scheduling an upcoming working group call or reviving a past group? Check out the DLF Organizer’s Toolkit. As always, feel free to get in touch at info@diglib.org

Get Involved / Connect with Us

Below are some ways to stay connected with us and the digital library community: 

The post DLF Digest: January 2025 appeared first on DLF.

by arubin at January 01, 2025 02:32 PM

Cynthia Ng

2024 Blog Year in Review + Hosting Provider Change

A short announcement attached to this year’s blog review post. Move off of WP.com to DreamHost I posted a while back about moving off of WordPress.com, and I’ve finally done the official switch over. It took a while to get the domain transferred, and then it was close enough to the end of the year … Continue reading "2024 Blog Year in Review + Hosting Provider Change"

by Cynthia Ng at January 01, 2025 08:17 AM

Hugh Rundle

So you want to study Library and Information Science in Australia?

I've had a couple of people ask for advice on LIS courses recently, so I thought it might be useful to write up some thoughts as a blog post in case anyone else has similar questions. To situate my perspective: I completed a Graduate Diploma in Information Management at RMIT University in 2003. I've worked in public libraries, for the library-owned cooperative CAVAL, and in an academic library. In that time I have supervised placements for both diploma and higher degree students. I currently work at a university but it's more than two decades since I received my librarianship qualification so I'm not across what is taught in the LIS curriculum now.

This post is primarily for people intending a career in libraries, but most of it applies if you're thinking of working as an archivist or records manager.

Is LIS for you?

As soon as you start talking to people about a possible new career in libraries or archives you will hear misinformation. Many people – including those in the industry already – have strong assumptions about what librarians and archivists do, and what their career prospects are likely to be, without necessarily having subjected those opinions to any rigorous testing. If anyone tells you they know what the career prospects for librarians will be in five or ten years, don't believe them. Nobody knows – including the university recruiters who will tell you whatever they think you need to hear to enrol.

No particular "personality type" suits library work – it's a broad field and there's something for everyone. If you love talking to people and working in a big team you could have a great time in public libraries or in academic library liaison work. If you love quiet space and getting into the detail of things, metadata or systems work could be for you. Thrive on short deadlines and pride yourself on your thoroughness? Maybe a career in legal or medical libraries awaits. School libraries aren't for you if you don't like children or teaching, but a corporate library career could be just the thing. But these are broad strokes – the truth is that all these roles and libraries need people with many different skills and inclinations, and sometimes require people to work against their skill set. Technical specialists need to be able to communicate effectively with those working in client-facing roles. Public library staff often need to be able to pull off an entertaining storytime on the same day they help someone with a local history query. Staff answering reference queries won't get far if they don't understand how library metadata systems work.

Contrary to popular myth, you won't get paid to read books all day, and outside of some niche library and archive roles you won't even spend much time actively helping researchers to find relevant books and papers for their latest project. But you certainly might enjoy a rewarding career with interesting challenges. If you love sending and receiving email, LIS could also be a great choice.

Do you need a qualification?

You may not need an additional qualification to get started in LIS if you already have qualifications and experience in a related field – and you might be surprised by what is related. Don't assume you need to already have your library qualification before you can find a paid role in the industry. If you have qualifications and experience in education, computer science or information technology, publishing, law, health and medical sciences, or academic research, you may be able to get a start in a professional-level role straight away. Having said that, it greatly depends on both the role and the attitude of the employer - so you should always check before spending time applying. You also will have very limited opportunities without a formal LIS qualification.

Choosing a degree

A formal LIS qualification is helpful (and often required), but working out which qualification you need can be confusing. The four types of LIS qualification are:

These courses all include a compulsory work placement.

After a period of consolidation there are now only three universities offering degrees - Charles Sturt University, Curtin University, and the University of South Australia. Diplomas are available from various TAFEs in New South Wales, Victoria, South Australia, Western Australia, Queensland, and Fiji. Courses you may see at Monash and RMIT are being taught out and are not accepting new enrolments.

If you do not already hold a university degree

Generally speaking, to gain a Diploma level qualification you study at TAFE rather than university. This has two big advantages - greater access to courses, and lower fees. A Diploma of Library and Information Services takes one year and costs around $12,600 for a government-funded place. A Bachelor degree would take three years and cost around $16,000 per year, or roughly $48,000 in total. A Diploma will generally give you credit towards a later Bachelor degree, so it's a common first step for people who aren't sure they want to fully commit, or simply can't afford to spend three years studying full time.

However, there's a catch - a Diploma qualifies you to be a "Library Technician" rather than a librarian. This is what is sometimes referred to as a "para-professional" role. My personal view is that these roles perpetuate a class divide within the profession and are used to underpay highly skilled professionals. The important thing you need to know is that if you hold only a Diploma you will not be eligible to apply for most qualified librarian positions. When you start looking for library jobs you may notice some position descriptions require applicants to "hold a degree conferring eligibility for Associate Membership of ALIA" or similar wording. This means Bachelor and higher degrees - a Diploma only confers eligibility for "Library Technician" or "General" membership. The Bachelor of Information Studies is available through Charles Sturt University. If you do not already hold a university degree in any other discipline, and you're sure you want to become a fully qualified librarian or archivist, this could be a good option. Opting for a Diploma and a career as a Library Technician can also be a reasonable choice - just make sure you know what you're getting into.

If you already hold a university degree in any discipline

If you hold a university degree in any discipline, you have more options. The most common path is the (one year full time, two years part-time) Graduate Diploma. Some people choose to complete a full Masters degree, which takes another semester, or two if you enrol in Curtin's "extended" degree. You can usually "upgrade" from a Grad Dip or "exit early" from a Masters course with a Graduate Diploma, but you should check this before enrolling.

The advantage of the Grad Dip is that you end up with a qualification as a librarian in half the time. Why then, would you complete a Masters? Firstly, a Masters allows you to explore more options - most of the Masters courses qualify graduates as a librarian, an archivist, or a records manager, depending on which electives and specialisations they have chosen. This is useful if you think you might be interested in any of these career paths but need to explore a bit more to decide which one. The second reason is that the Masters degree is more widely recognised internationally. Specifically, in the United States generally only holders of a Masters degree are recognised as being fully qualified librarians. If you're considering working in the USA at some point, a Masters qualification will make things easier.

The hardest of the hardcore enrol in a Master of Education (Teacher Librarianship). Look forward to an exciting career keeping up to date with two different sets of professional knowledge simultaneously, whilst embodying the enduring mental picture of what a librarian is for generations of children. No pressure!

You may not be able to secure a Commonwealth-Supported place for a Graduate Diploma or Masters, so these courses can be fairly expensive. If you're an Australian citizen you are still eligible for a FEE-HELP (HECS) loan.

Making the most of being a student

Students are seen as the future of the GLAM professions, so professional organisations offer generous membership rates and opportunities. Make sure you take advantage of these, because they will not last beyond your degree. Attending events, meeting people and reading reports and papers are all excellent ways to develop your knowledge and professional networks even if you are not yet working in the sector.

All GLAM professional organisations offer a heavy discount for student members:

Or to put it a different way, you could join all of the above organisations as a student member for $225 a year – less than the cost of joining one of ALIA, ASA, or RIMPA as a professional member.

Joining these organisations gives you cheap or free access to events like webinars, workshops and meet-ups, as well as some training, PD and professional literature access. Some conferences also offer free tickets to student volunteers, and heavily discounted tickets for students. Most importantly, you'll be able to meet people working in the industry and build connections to future colleagues and employers.

Work Placements

Every degree includes a compulsory work placement - usually two weeks, sometimes three. The quality of your placement experience may vary depending on how seriously the host institution takes placements, the time of year, your needs, and how well matched you are. The first thing you need to be aware of is that LIS placements are always unpaid, and usually your educational institution will not let you do your placement in your existing workplace if you already have a library job. You might think that if the point is to give you real-world experience and you already work in a library, you should be able to just get credit for that. You'd be right, but that's not how it works.

Whether you already have some kind of paid or volunteer library job or not, the work placement is a good opportunity to see what things look like in a real workplace that you're unfamiliar with. If you know all about storytime at the public library, consider getting a placement in a research archive just to see if you like it. If you're sure you want to be a metadata specialist, consider asking to be placed with an outreach and information literacy team. Lots of students surprise themselves partway through their degree and end up on a different career trajectory to the one they expected – placements are a great opportunity to try something new.

Some institutions expect the students to organise their own placements, or at least to identify somewhere they would like to be placed. Hosting student professional placements can be resource-intensive for the host institution. Whilst in theory students are contributing to real work, host institutions have to organise some onboarding, supervision, and usually some kind of special project that can be completed part time in two weeks. This is a lot of work for us! You can make things easier by:

Preparing for a LIS career

As well as studying hard, joining and participating in professional organisations, and preparing well for your placement, there are some other things you can do to prepare for a LIS career. Try to keep a cool head – you're going to receive a lot of advice, both solicited and unsolicited, about how you should or should not build a professional profile. The most important thing is to choose something that works for you and feels reasonably natural. Some people embrace social media and LinkedIn. Others never post online but join committees and volunteer at conferences. Others create a profile by writing articles or blogs.

Given we're information professionals, I usually encourage LIS students and graduates to register a personal domain name and create at least a basic personal website where you can post a professional biography and links to publications, social media and so on. Prospective employers will often do a web search for you, and it's good to consider which site you want to appear at the top of the list.

The other thing you need to keep in mind when meeting people is that it is a very small industry. Chances are high that you will meet these people again - at a conference, in an office, or across an interview desk. Try to remember this before you launch into your newly-formed opinion about a library service or professional colleague. You might be right, but it might not be what you want to be remembered for. I'm telling you this as a highly-opinionated friend.

Lastly: be careful about volunteer work. There's nothing wrong with volunteering your time for public service, as long as you enjoy the work and/or spending time with the other volunteers. It is easy, however, for volunteering to tip into exploitation of the volunteers and be used as a pressure point against paid workers. If you have no experience in the industry and want to "get your foot in the door", it's hard to avoid doing unpaid work, but take a broad view and consider who is getting the most value out of your experience. Every "volunteer" position that requires the same skills and knowledge as a paid role is one fewer of the paid positions you are hoping to secure.


Glossary

More information

Updated with corrections 1 January 2025.


by Hugh Rundle at January 01, 2025 12:00 AM

December 31, 2024

Lucidworks

How AI Is Revolutionizing The Manufacturing Industry

AI revolutionizes manufacturing with predictive maintenance, automation, and data-driven insights, enhancing efficiency and innovation.

The post How AI Is Revolutionizing The Manufacturing Industry appeared first on Lucidworks.

by Patrick Romanus at December 31, 2024 08:38 PM

John Mark Ockerbloom

Let us sing a song of cheer again

The depths of the Depression would seem an unpromising time to revive the 1929 song “Happy Days Are Here Again”. But after it was played at the 1932 Democratic convention, it caught on as a song of hope for better things to come. And after years of work and struggle, prosperity returned.

Tomorrow this song and the rest of our works join the public domain. And while tough times may still threaten, with work, struggle, and hope we too may bring happy days here again.

by John Mark Ockerbloom at December 31, 2024 05:56 PM

David Rosenthal

Self-Own

Credit: XKCD
The Cambridge dictionary defines the verb to bullshit as:
a rude word meaning to try to persuade someone or make them admire you by saying things that are not true
The essence of successful bullshit is that it should be both plausible and presented authoritatively. Bullshitters are always tempted to buttress the appearance of authority by including actual evidence rather than just their interpretation of the evidence, but this is often a fatal mistake. Below the fold I discuss a classic example from MAGA's campaign to demonize immigrants.

@GabeGuidarini tweeted:
𝐓𝐡𝐢𝐬 𝐢𝐬 𝐭𝐡𝐞 𝐦𝐨𝐬𝐭 𝐢𝐦𝐩𝐨𝐫𝐭𝐚𝐧𝐭 𝐩𝐨𝐬𝐭 𝐲𝐨𝐮’𝐥𝐥 𝐞𝐯𝐞𝐫 𝐫𝐞𝐚𝐝 𝐨𝐧 𝐦𝐚𝐬𝐬 𝐢𝐦𝐦𝐢𝐠𝐫𝐚𝐭𝐢𝐨𝐧 𝐢𝐧𝐭𝐨 𝐭𝐡𝐞 𝐔𝐧𝐢𝐭𝐞𝐝 𝐒𝐭𝐚𝐭𝐞𝐬:

In 1924, President Calvin Coolidge signed the Johnson-Reed Immigration Act, which halted mass immigration into the United States.

The act completely stopped immigration from Asia, and set strict quotas on immigration from other places including Europe.

Wages began to dramatically increase.

By the 1950s, America reached the peak of its economic and industrial might. Economic inequality was low. Large businesses and workers alike were prospering.

Then, in 1965 the Hart-Cellar Immigration Act was signed by President LBJ, opening the floodgates and allowing for massive unchecked immigration, especially from the global south.

Immediately, income growth and wage growth for the bottom 90% of earners came to an abrupt halt.

Soon after, the incomes for the top 1% of earners skyrocketed, leading to the massive wealth inequality in America we see today.

The political left attributes this to simply a lack of proper taxation and regulation, but that’s mainly not the case.

Mass immigration has allowed for modern scab labor to become the norm, allowing large companies to pay lower wages for foreign migrants who demand less.

The loser? America’s middle class, which has been deteriorating for more than half a century now.

When you see wealthy technocrats argue in favor of mass “skilled” immigration, keep this in mind.
Who is GabeGuidarini?
Ohio Field Rep @TPAction_. Comms @OhioCRs. President @UDRepublicans. Fmr Acting President @uscollegegop.
He is a Republican operative. His chain of causation sounds convincing, doesn't it?

But he made the bullshitter's fatal mistake. He included this graph. Let's annotate it. First we have Guidarini's claim that:
In 1924, President Calvin Coolidge signed the Johnson-Reed Immigration Act, which halted mass immigration into the United States.

The act completely stopped immigration from Asia, and set strict quotas on immigration from other places including Europe.

Wages began to dramatically increase.
Let's put 1924 on the graph.

Whose wages had the dramatic increase in 1924? Right, the 1%. Whose wages were flat from 1924 until the Great Depression? Right, the bottom 90%. Guidarini is just wrong. The dramatic increase in the wages of the bottom 90% happened in 1940, 16 years later. Can you think of a cause for the dramatic increase in 1940? It surely wasn't the Johnson-Reed Immigration Act of 1924.

Second, we have Guidarini's claim that:
By the 1950s, America reached the peak of its economic and industrial might.
If America's "economic and industrial might" peaked in the 50s, why did the incomes of the bottom 90% continue rising until 1973?

Third we have Guidarini's claim that:
Then, in 1965 the Hart-Cellar Immigration Act was signed by President LBJ, opening the floodgates and allowing for massive unchecked immigration, especially from the global south.

Immediately, income growth and wage growth for the bottom 90% of earners came to an abrupt halt.
Let's put 1965 on the graph.


Can you see the income growth for the bottom 90% come to an immediate halt? No, wages continued rising, as they had since 1940, until 1973, another 8 years. Did the Hart-Celler Immigration Act of 1965 cause wages to flatten in 1973? Whose income flat-lined after 1965 for a decade and a half? Right, the 1%.

Lastly, we have Guidarini's claim that soon after 1965:
the incomes for the top 1% of earners skyrocketed, leading to the massive wealth inequality in America we see today.
Can you see when the "incomes for the top 1% of earners skyrocketed"? That's right, it was in 1985, so "soon" means 20 years! And can you guess who was President of the US, and from what party, then? Right, it was a certain Ronald Reagan, a Republican.
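The gaps between each act and the change it supposedly caused can be tallied in a few lines. This is only a sketch: the years are the ones I read off the graph above, and the labels and the `lag_years` helper are my own illustrative names, not anything from the original chart or its data source.

```python
# Years are those quoted in the post's reading of the graph;
# they are assumptions for illustration, not independently sourced data.
claims = [
    ("1924 Act -> bottom 90% wage surge", 1924, 1940),
    ("1965 Act -> bottom 90% wage stall", 1965, 1973),
    ("1965 Act -> top 1% income takeoff", 1965, 1985),
]


def lag_years(cause_year: int, effect_year: int) -> int:
    """Return the number of years between a claimed cause and its effect."""
    return effect_year - cause_year


# Map each claim to the delay between the act and the change on the graph.
lags = {label: lag_years(cause, effect) for label, cause, effect in claims}

for label, lag in lags.items():
    print(f"{label}: {lag} years")
```

None of the three delays is short enough to fit "began to", "immediately" or "soon after".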

by David. (noreply@blogger.com) at December 31, 2024 04:00 PM

Peter Murray

One Year of Learning 2024

Inspired by Tom Whitwell's 52 things I learned in 2022, I started my own list of things I learned in 2023. Reaching the end of another year, it is time for Things I Learned In 2024:

  1. Some jurisdictions use "day fines"—or fining an offender based on that person's daily personal income. The number of days would be scaled to the seriousness of the offense. Day Fine, Wikipedia
  2. There are over twice as many federally-recognized Indian tribes as there are countries in the United Nations. The 574 Federally Recognized Indian Tribes in the United States, Congressional Research Service
  3. Crayons were invented in Sandusky, Ohio, in 1902 by a school teacher and his brother experimenting with adding waxes to chalk. How one Ohio town once claimed the title of ‘color capital of the world’, The Ohio Newsroom
  4. “Schrödinger’s cat”—the thought experiment in which a cat in a box can be considered both alive and dead—was first published in a scientific journal in 1935. It didn't enter the popular imagination until Ursula K Le Guin, a science fiction author, published a short story in 1974. Ursula Le Guin: the pioneering author we should thank for popularizing Schrödinger’s cat, Quantum Magazine.
  5. The U.S. Air Force has a facility in New York where it mounts its aircraft upside-down to test radio emissions. The Fascinating Story Of The USAF’s “Upside-Down Air Force”, The War Zone
  6. The largest energy consumer in California is a pumping station that raises water 2,000 feet (600 meters) to cross the Tehachapi Mountains at the southern end of the state. At full capacity, the station moves 2 million gallons a minute for agriculture and drinking. How Infrastructure Works: Inside the Systems That Shape Our World, by Deb Chachra.
  7. Interlibrary loan was first conceived by Alexandre Vattemare, a French ventriloquist who inspired the founding of Boston Public Library. Alexandre Vattemare on Wikipedia via Camwyn on Mastodon and Mike Taylor.
  8. In 1959, a cement mixer's bucket was left behind on an Oklahoma rural road. In 2011, an artist couple turned it into a space capsule. The Cement Mixer Space Capsule of Winganon, Amusing Planet
  9. Atomic clocks built for use on Earth will run faster on the moon, creating the need for "Lunar Coordinate Time" to support navigation and scientific research on the moon. What Time Is It on the Moon?, National Institute of Standards and Technology
  10. A Charlie Brown Thanksgiving premiered in Canada on Oct 6, 1973 — six weeks before it premiered in the United States. Nat Gerler on Bluesky
  11. The first virtual meeting was in 1916 between members of the American Institute of Electrical Engineers — 5,000 attendees in eight cities (and 95 years before Zoom was founded). The First Virtual Meeting Was in 1916: The amazing feat linked up 5,100 engineers from Atlanta to San Francisco, IEEE Spectrum, 13-Nov-2024
  12. The cumulative land area of China, the United States, India, Mexico, Peru, and Europe still isn't enough to match the African continent. Somalia, Japan, and New Zealand are all approximately the same size as the US East Coast. See this and more shown with maps!
  13. Debit card and credit card transactions were nearly identical in dollar amounts ($4.55 trillion versus $4.88 trillion) in 2021, but there were twice as many debit transactions as credit transactions (106 billion versus 51 billion). Credit Card Swipe Fees and Routing Restrictions, Congressional Research Service, 8-Oct-2024

Other lists:

by Peter Murray at December 31, 2024 03:35 PM

John Mark Ockerbloom

It’s a dead man’s party

The Disney studio had a productive year in 1929. Along with releasing 12 new Mickey Mouse cartoons, it began a series of one-shot musical cartoons with animation designed to fit the music, instead of the other way around. The “Silly Symphony” series began with “The Skeleton Dance”, a creepy graveyard cartoon set to music Carl Stalling wrote after Disney couldn’t get rights to Saint-Saëns’ Danse Macabre. Watchable online now, it rises to the public domain in 2 days.

by John Mark Ockerbloom at December 31, 2024 12:23 AM

December 30, 2024

Open Knowledge Foundation

Transforming Nepal’s Data Ecosystem: 2024 in Review

2024 marked a significant year for Open Knowledge Nepal (OKN) as we strengthened data-driven governance and fostered a culture of open data and collaboration across Nepal. As the year comes to a close, we reflect on our major initiatives and achievements that paved the way for a more transparent and innovative data ecosystem.

Integrated Data Management System (IDMS)

This year, OKN addressed the persistent challenges of data silos and fragmented datasets in Nepal’s local governments through the Integrated Data Management System (IDMS) for Local Government project. Supported by The Asia Foundation’s Data for Development (D4D) Programme, this initiative provided comprehensive technical and non-technical support to enhance data utilisation at the local level.

In its fourth phase, the project emphasised data-driven decision-making, empowering municipalities with localised data management systems and knowledge. It is currently implemented in five municipalities – Birgunj Metropolitan City, Tulsipur Sub-Metropolitan City, Janakpurdham Sub-Metropolitan City, Lekbeshi Municipality, and Suddhodhan Rural Municipality – laying a strong foundation for sustainable data practices and improved governance.

Data Hackdays

As part of the Women in Data Conference 2024, OKN and The Algorithm organised Data Hackdays in Tulsipur and Birgunj on August 25 and September 1. These events brought together local governments and community members to explore the potential of data through the lens of the IDMS. Participants developed data stories addressing critical themes like Women’s Statistics, Health, Environment, and Education.

These hackdays highlighted the importance of evidence-based decision-making while fostering collaboration between local governments and communities. They showcased the growing involvement of women in data, emphasising the need for sustained efforts to enhance data literacy and public engagement.

Open Data Nepal

Since its launch in 2018, Open Data Nepal has worked to make Nepal’s data permanently accessible online. In 2024, OKN initiated a comprehensive revamp of the portal to address fragmented and non-machine-readable datasets. The updated version, set to launch in 2025, will feature improved accessibility, user-friendly design, and advanced tools for data visualisation and exploration, further empowering citizens, researchers, and developers.

Local Government Data Profile

In 2024, OKN leveraged the IDMS to develop a sustainable Local Government Data Profile (LG Profile). This initiative aims to streamline data processes, enhance decision-making, and set a benchmark for data-driven governance. Addressing challenges like capacity gaps and vendor-locked systems, the LG Profile emphasises scalability, interoperability, and greater data ownership by local governments.
Currently being implemented at Tulsipur Sub-Metropolitan City, the LG Profile showcases how municipalities can adopt innovative, sustainable data practices to strengthen governance and resource allocation.

Events and Engagements

2024 was a vibrant year for OKN, filled with impactful events and collaborations aimed at promoting open data, inclusivity, and capacity building.

Collaborations and Partnerships

In 2024, OKN collaborated with leading organisations, including the Open Knowledge Foundation, The Asia Foundation’s D4D Programme, the Women in Data Steering Committee, Accountability Lab Nepal, and local governments. These partnerships strengthened Nepal’s open data ecosystem, fostering innovation and transparency.

Looking Ahead to 2025

As we close 2024, we extend heartfelt gratitude to our partners, collaborators, and the community for their unwavering support.

2025 promises to be a year of growth and impact, with key events like the PublicBodies Datathon, Women in Data Conference, and Open Data Day(s) on the horizon. We will also continue advancing projects like IDMS, Open Data Nepal, and the LG Profile, driving innovation in Nepal’s data ecosystem.

Expanding our reach, we look forward to launching regional projects focusing on the Asia region, aligning with the Open Knowledge Foundation’s strategic focus on The Tech We Want.

Here’s to a collaborative, sustainable, and open 2025!

by Open Knowledge Nepal at December 30, 2024 03:56 PM

December 29, 2024

John Mark Ockerbloom

Comfortable neutrality is not an option

“I was so happy. I was so safe,” laments Lois Farquar to a suitor late in Elizabeth Bowen’s The Last September. But from the book’s start, as she and her fellow Anglo-Irish gentry enjoy parties and dances, their Irish neighbors are fighting for independence from Britain, while they entertain British soldiers sent to suppress the rebellion. Unable to commit politically or romantically, Lois and her family lose much. Bowen’s novel joins the US public domain in 3 days.

by John Mark Ockerbloom at December 29, 2024 10:31 PM

December 28, 2024

John Mark Ockerbloom

A pioneering American graphic novel

Lynd Ward’s Gods’ Man is a novel without words (apart from chapter titles) about an artist who makes a Faustian bargain with a masked stranger for artistic success. Told in 139 woodcuts, it was the first of 6 wordless novels by Ward, and the first American novel of its kind. Selling well when published in 1929, it influenced artists like Art Spiegelman and Will Eisner, who made graphic novels a genre of widespread ongoing interest. The public domain claims it in 4 days.

by John Mark Ockerbloom at December 28, 2024 05:18 PM

Making a scene

Elmer Rice’s boisterous Street Scene wasn’t easily staged. Though set in front of a single tenement, it required over 30 actors, prompting many producers to turn it down. Rice eventually had to direct the first production himself. But that production ran for over 600 performances on Broadway, won the 1929 Pulitzer Prize, and was later adapted into a film and an opera. A 2013 production staged it in the open on an actual New York street. The play opens in the public domain in 5 days.

by John Mark Ockerbloom at December 28, 2024 12:14 AM

December 26, 2024

John Mark Ockerbloom

“A classic in the field of Jewish music”

In 2010 Israel Katz called Abraham Zevi Idelsohn “the undisputed pioneer-scholar of Jewish music”. Part of Idelsohn’s claim to fame is his comprehensive survey Jewish Music in its Historical Development, written while he was cataloging the Eduard Birnbaum Collection of Jewish Music at Hebrew Union College. Covering Jewish music and its various influences from Biblical times to the early 20th century, the book was published in 1929, and joins the public domain in 6 days.

by John Mark Ockerbloom at December 26, 2024 03:04 PM

December 25, 2024

John Mark Ockerbloom

“God’s glory and my country’s shame”

In the midst of a “cruel land, this South”, Christ makes an unexpected sort of appearance in Countee Cullen’s long poem “The Black Christ”. It’s one of a number of poems in The Black Christ and Other Poems dealing with faith, injustice, sin, racial violence, and African American experience, among other themes.

The University of Missouri libraries has an exhibit of pages from the book, illustrated by Charles Cullen. The complete book comes to the public domain in 7 days.

by John Mark Ockerbloom at December 25, 2024 09:33 PM

December 24, 2024

John Mark Ockerbloom

There’s a song in the air

Most Christmas songs Americans are used to hearing on the radio were published after 1929. But most Christmas songs they’re used to singing in church are older. Much of the traditional American repertoire is in George Rittenhouse’s World Famous Christmas Songs, with 74 carols and nativity songs “specially arranged for popular usage in community caroling, school, chorus, church and home”. First published in 1929, and reissued in 1957, it’s in the public domain in 8 days.

by John Mark Ockerbloom at December 24, 2024 08:39 PM

Nick Ruest

Exploring 🫡 Twitter Data

Overview

The dataset was collected with Documenting the Now’s twarc using version twarc2 via the Academic Access v2 Endpoint. It contains 22,919,247 Tweets with the search term 🫡 from October 28, 2022 through February 18, 2023 (missing October 31, November 15, November 18, December 31, 2022, and January 7, 2023).

I wrote this back in April 2023:

Why not grab as many 🫡 emoji tweets while the platform is on fire? Seems like a fitting final use of Twitter Academic Research Access. I should probably just make that a Wayback link preemptively, eh? Who knows how long that page will be there at this rate!

Anyway, I was curious if there would be a big spike in the emoji’s usage on a few days during the hardcore timeline. After pulling the data and plotting it, it doesn’t really look like that’s the case.

Here is the old Tweet Volume graph.

🫡 Tweet Volume

Here is a new one, which really exposes a misunderstanding I had in harvesting tweets with twarc2 via date ranges. This is also illustrated in the above chart if you look closely. 🤦

saluting-face-2022-10-28-2022-10-31.dat
saluting-face-2022-11-01-2022-11-15.dat
saluting-face-2022-11-16-2022-11-18.dat
saluting-face-2022-11-19-2022-12-31.dat
saluting-face-2023-01-01-2023-01-07.dat
saluting-face-2023-01-08-2023-02-19.dat
🫡 Tweet Volume
🫡 wordcloud
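
The file names above hint at one possible reading of the gaps in the dataset, though the post does not spell out what the misunderstanding was. The sketch below assumes each harvest window's end date was treated as exclusive, so the end day itself was never collected. The window list is copied from the .dat file names; `covered_days` and `missing_days` are hypothetical helper names, not part of twarc.

```python
from datetime import date, timedelta

# Harvest windows as (start, end) pairs, copied from the .dat file names.
HARVEST_WINDOWS = [
    (date(2022, 10, 28), date(2022, 10, 31)),
    (date(2022, 11, 1), date(2022, 11, 15)),
    (date(2022, 11, 16), date(2022, 11, 18)),
    (date(2022, 11, 19), date(2022, 12, 31)),
    (date(2023, 1, 1), date(2023, 1, 7)),
    (date(2023, 1, 8), date(2023, 2, 19)),
]


def covered_days(windows):
    """Days harvested, ASSUMING each window's end date is exclusive."""
    days = set()
    for start, end in windows:
        current = start
        while current < end:  # the end date itself is never collected
            days.add(current)
            current += timedelta(days=1)
    return days


def missing_days(windows):
    """Days between the first and last harvested day that no window covers."""
    days = covered_days(windows)
    first, last = min(days), max(days)
    span = (first + timedelta(days=n) for n in range((last - first).days + 1))
    return [day for day in span if day not in days]


if __name__ == "__main__":
    for day in missing_days(HARVEST_WINDOWS):
        print(day.isoformat())
```

Under that assumption the uncovered days are October 31, November 15, November 18, and December 31, 2022, plus January 7, 2023, which matches the missing dates listed in the overview, and the last covered day is February 18, 2023.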

Top languages

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)

df.language
  .groupBy("lang")
  .count()
  .orderBy(col("count").desc)
  .show(10)
+----+-------+
|lang|  count|
+----+-------+
|  en|9244552|
|  ja|5781358|
| und|2211360|
|  es|1090225|
|  pt| 973493|
|  ar| 726801|
|  ko| 473167|
|  fr| 351251|
|  in| 327702|
|  th| 290914|
+----+-------+

Top tweeters

Using saluting_face-user-info.csv from df.userInfo.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("saluting_face-user-info"), and pandas:

🫡 Top Tweeters
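
The counting step behind a chart like this can be sketched roughly as follows. This is not the pandas code used for the chart, just a standard-library stand-in for the same idea. The `screen_name` column and the file path in the usage note are assumptions about the exported CSV, and `top_values` is a hypothetical helper.

```python
import csv
from collections import Counter


def top_values(csv_path, column, n=10):
    """Return the n most frequent values in one column of a CSV file."""
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            value = row.get(column)
            if value:  # skip rows where the column is missing or empty
                counts[value] += 1
    return counts.most_common(n)
```

Something like `top_values("saluting_face-user-info.csv", "screen_name")` would then return (value, count) pairs ready for plotting. The same helper should work for the hashtags export, given the right column name.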

Retweets

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)

df.mostRetweeted.show(10, false)
+-------------------+-----------------+                                         
|tweet_id           |max_retweet_count|
+-------------------+-----------------+
|1603224385477054465|147274           |
|1560231432207106048|86855            |
|1604556216889327617|76775            |
|1553755497014013953|71583            |
|1600843786640203783|66141            |
|1568676995743248385|47865            |
|1536619482281820160|43970            |
|1583007717253591041|38834            |
|1534770573804707840|38258            |
|1602561118056378368|37725            |
+-------------------+-----------------+

From there, we can append the tweet ID to https://twitter.com/i/status/ to see the tweet. Here are the top three:

  1. 147,274

  2. 86,855

  3. 76,775
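
Building those links is a one-liner. Here is a small sketch using the three most-retweeted IDs from the table above; `tweet_url` is a hypothetical helper name, and whether the links still resolve depends on the tweets still existing.

```python
STATUS_BASE = "https://twitter.com/i/status/"

# The three most-retweeted tweet IDs, copied from the table above.
TOP_THREE = [
    1603224385477054465,
    1560231432207106048,
    1604556216889327617,
]


def tweet_url(tweet_id):
    """Append a tweet ID to the status base URL."""
    return f"{STATUS_BASE}{tweet_id}"


for tweet_id in TOP_THREE:
    print(tweet_url(tweet_id))
```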

Top Hashtags

Using saluting_face-hashtags.csv from df.hashtags.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("saluting_face-hashtags"), and pandas:

🫡 hashtags

방탄소년단 is BTS written in Hangul.

Top URLs

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)

df.urls
  .groupBy("url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+----------------------------------------------------------------------+------+
|url                                                                   |count |
+----------------------------------------------------------------------+------+
|https://youtu.be/L-orDkbsuHk                                          |129146|
|https://t.co/kkrMEH3Xjo                                               |128719|
|https://twitter.com/ENHYPEN_members/status/1600843786640203783/photo/1|52397 |
|https://t.co/rJ8a1ygDru                                               |52396 |
|https://t.co/F0tQ8nIOOZ                                               |33000 |
|https://twitter.com/WayV_official/status/1613778127846801408/photo/1  |33000 |
|https://twitter.com/btsinthemoment/status/1602561118056378368/video/1 |31838 |
|https://t.co/Nr45zhQBNo                                               |31248 |
|https://twitter.com/claricetudor_/status/1613609828340858919/photo/1  |30944 |
|https://t.co/TwIkLBAYWA                                               |30944 |
+----------------------------------------------------------------------+------+

Top media urls

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)
  
df.mediaUrls
  .filter(col("media_url").isNotNull)
  .groupBy("media_url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)  
+-----------------------------------------------+-----+
|media_url                                      |count|
+-----------------------------------------------+-----+
|https://pbs.twimg.com/media/FjdW7uqUoAED64t.jpg|52397|
|https://pbs.twimg.com/media/Fl-EBDrWYAADM8o.jpg|25855|
|https://pbs.twimg.com/media/FhFGxc_akAAz5p0.png|15859|
|https://pbs.twimg.com/media/FkTWga5WAAAOEqf.jpg|15529|
|https://pbs.twimg.com/media/FmSxlMFWAA095lk.jpg|15473|
|https://pbs.twimg.com/media/FmSxlMPWAAkVM_H.jpg|15473|
|https://pbs.twimg.com/media/FhAKydlakAAnYp2.jpg|14314|
|https://pbs.twimg.com/media/FjI8sOPacAAilEE.jpg|13295|
|https://pbs.twimg.com/media/FktSKyxXkAEvFyV.jpg|11956|
|https://pbs.twimg.com/media/FmVKpX9aMAENckX.jpg|11000|
+-----------------------------------------------+-----+
🫡 Top Media

Emotion score

Using saluting_face-text.csv from df.text.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("saluting_face-text"), Polars, and j-hartmann/emotion-english-distilroberta-base:

Emotion Distribution in 🫡 Tweets

December 24, 2024 05:00 AM

December 23, 2024

Terry Reese

MarcEdit 7.7 update

With any luck, this post will show up sometime on 12/23 and I’ll have successfully figured out how to schedule posts :).

First, to everyone who celebrates, I hope you are having the happiest of holidays. If weather, travel, etc. go well, we’ll be celebrating with family and enjoying a little bit of downtime with people that we love and care about. I hope that however you choose to spend the time, it’s merry.

On to MarcEdit. A couple of update notes. The first relates to the Mac version of the application. After updating MarcEdit 7.7 to the new .NET Core framework (.NET Core 8.0 LTS), I switched to my Mac to make changes and recompile the code, targeting the new framework…and I was in for a surprise. I had not been paying attention, but Microsoft has essentially ended development using Visual Studio on Mac systems and has transitioned to Visual Studio Code. That isn’t much of a big deal: Visual Studio Code has a great extension and can easily be used to write C# applications. The bigger issue was that the UI methodology I had been using isn’t being ported forward to support .NET Core 8. Since all of the current application now uses that framework, that is a problem. That leaves me with a couple of options:

  1. Stop supporting Apple devices (not really something I’d like to do)
  2. Rewrite the entire application using Microsoft’s new MAUI framework (or something similar)
  3. Look for another option

I have decided to look for another option because, while there aren’t a lot of Mac downloads, there are enough that I’d like to keep the option open. So, I did a little research and testing, and for now, I’ve created a process where you can run MarcEdit 7.7 via Wine on a Mac. Essentially, to make this work, I’ve done the following:

  1. Updated some MarcEdit 7.7 code to support how Apple passes arguments
  2. Created a shim script that acts as the application wrapper and can be used to create a native Mac bundle, allowing for plist creation and entitlements to ensure the program works as closely to a native Mac application as possible.
  3. Created a pkg, where the app and pkg have been signed and notarized using Apple’s signing tools and my developer code-signing certificate.

The results can be seen here:
https://youtu.be/jgkYJEfcZY0?si=zil7j5fZhoeJXyUq

I’ve written up some instructions; you will find them on the download page with the Mac option. I will provide a link to both the previous version and the new version, but at this point, all new development will happen on the version being wrapped for use with Wine. Long-term, I will likely recode the application to use a framework like MAUI or UNO so I can write one interface that compiles natively for multiple systems, but that is likely an 8-12 month project, and right now only the UNO framework is complete enough to make the transition work. So, while I evaluate frameworks, I’ll be moving all business code into a new code library, MarcEdit.Essentials, which should simplify the move at a later date.

Changes

Updates to this version of the application are as follows:

That’s pretty much it. Downloads have been posted, updates should prompt, the org version has been refreshed.

Again, have a happy holiday!

–tr

by reeset at December 23, 2024 08:57 PM

John Mark Ockerbloom

The treachery of images, and the elusiveness of their copyright status

Duke’s Public Domain Day 2025 post discusses René Magritte’s pipe painting, “La Trahison des Images”, and the difficulty of determining whether US copyright law considers it “published” in 1929, and therefore public domain in 9 days.

A related work we know meets that criterion is his illustrated essay “Les Mots et Les Images”, showing distinctions between words, images, and objects, which his painting also expresses. It’s in the last issue of La Révolution Surréaliste.

by John Mark Ockerbloom at December 23, 2024 06:22 PM

Open Knowledge Foundation

Making Data Work Easier: Highlights from the Global Voices Summit 2024

How often do you interact with data in your daily tasks? For many of us, working with data has become a regular part of our professional and personal lives. But, do you have the coding skills needed to clean and prepare data for deeper analysis?

When asked these questions during our session at the Global Voices Summit 2024 – a global gathering of digital media, knowledge, and activism leaders – the responses were revealing. While half the participants acknowledged working with data daily, only a small fraction raised their hands when asked about coding skills. This reflects a widespread reality: Many people work with data, but few have the tools or expertise to handle it efficiently.

Recognising these challenges, we introduced the Open Data Editor (ODE) during the Summit’s demo session. Designed as a practical tool for non-technical users, ODE is a desktop application that simplifies error detection, editing, and publishing for tabular data. In this demo session, we explored best practices for working with data and demonstrated how ODE can help users identify and correct common dataset errors, making collaboration and sharing easier. Whether you’re a journalist, public official, or researcher, ODE enables you to produce high-quality, reusable datasets without the need for coding expertise.

Because non-technical users often encounter significant difficulties when handling large datasets:

Open Data Editor and Why It Matters?

The ODE, powered by the Frictionless toolkit, addresses common challenges in data handling with a suite of powerful features. For instance, ODE’s simplified error detection automatically identifies issues such as missing headers, duplicate columns, and incorrect data types, ensuring that errors are promptly flagged. Its visual editing tools further enhance usability, allowing users to explore highlighted errors and correct them directly within the application, making the process both intuitive and efficient. Additionally, ODE excels in metadata management, enabling users to edit and maintain structured metadata, ensuring datasets remain clean and well-documented for future use.

ODE empowers individuals and organisations to unlock the full potential of their data. By simplifying complex workflows, it allows users to streamline data handling, enabling them to focus on deriving meaningful insights rather than being hindered by technical challenges. Moreover, ODE significantly enhances collaboration by facilitating the creation of clean, shareable datasets. This ensures teams can work together more effectively, aligning efforts and maintaining consistency. Its robust error detection and correction capabilities also increase accuracy, reduce mistakes, and improve the reliability of data-driven decisions.

In essence, the Open Data Editor enables users to work smarter, not harder, revolutionising how they manage and leverage their data.

Feedback from the Summit

Participants shared insightful feedback at the Global Voices Summit 2024 to further enhance ODE. One key suggestion was to introduce a more robust error-reporting mechanism, allowing users to flag and document errors more effectively. This improvement would enhance transparency and significantly improve the user experience. Additionally, participants emphasised the integration of AI capabilities as a transformative feature, given the growing reliance on AI. Such capabilities could enable smarter recommendations for error correction and further streamline workflows.

Another area of interest was the availability of a web-based version of ODE. A cloud-based solution would provide greater accessibility and ease of use across devices. For instance, some participants highlighted that their organisations do not permit the installation of new applications on company computers and devices, making a web-based option essential.

Lastly, participants stressed the need for a simpler pitch to communicate ODE’s value more clearly and effectively. A concise and compelling message would help ensure that its benefits resonate with a broader audience. These suggestions underscore the ongoing commitment to making ODE an even more powerful, accessible, and user-friendly tool for data practitioners.

Looking to make data management easier and more accessible? The Open Data Editor is here to help. Download the latest version today and experience its transformative features firsthand!

by Nikesh Balami at December 23, 2024 11:05 AM

Artefacto

Top Library Links of 2024

As the year ends, we always like to take a moment to reflect on the past 12 months and what we’ve learned.  A favourite part of this is sharing the most popular newsletter links, which inspire our team and, hopefully, our readers too. And while admittedly our own blog was a little neglected this year, [...]

Continue Reading...

Source

by Artefacto at December 23, 2024 09:30 AM

Raffaele Messuti

My linkblog on fediverse

Hyperlinks are the essence of the web. They enable content discovery, allowing users to navigate between diverse sources of information with different interfaces, graphics, and technologies. Using links is straightforward - you just need to click or tap on them. It's also easy to create new links on the web; you just need to follow some basic rules and conventions.

Lately, there has been a renaissance of linkblogs, blogs focused on sharing curated links. Some notable examples of linkblogs I follow: Simon Willison's Links, Nelson's Linkblog, Kellan's Linkblog. Things Magazine is also a linkblog, as is the wonderful Italian newsletter Link Molto Belli == Very Beautiful Links. My RSS reader also follows many accounts from Pinboard; they are technically personal bookmarks, but I consider them equivalent to linkblogs too.

I've been thinking about creating my own linkblog for a while now. I browse the web and save many bookmarks, some are private, but most can be public. They reflect a curated filter of web content about topics related to my interests (digital libraries and archives, web archiving, books, mountains, and obscure music).

I discarded the idea of using any blogging service; I prefer to self-host my content (like this blog). I could have used a static site generator - there are dozens of them - but I always struggle to find one as simple as I want. Furthermore, since a linkblog is related to content I browse, I want something with less friction than creating a markdown file, pushing to a repo, and waiting for the build. I want an admin interface where I can post quickly, without leaving the browser, maybe with the help of a bookmarklet, a browser extension, or a Tampermonkey script.

I have also evaluated Pocketbase, which is a very nice application platform. You can easily create a data model (with migrations), the UI is minimalistic and beautiful, and you can easily plug in code (this was my first attempt that just publishes an RSS feed, linkbase). It's very easy and powerful, but some things are missing: an HTML interface (which could be quickly done with templ), but more importantly, Fediverse integration. Because yes, for a linkblog an RSS/Atom feed is mandatory, but these days ActivityPub is also a good way to publish content and reach readers.

A full Mastodon instance is overkill, considering the resources and maintenance required. I want something simpler. Here comes Snac, a simple, minimalistic ActivityPub instance written in portable C. A database is not needed, the data is stored in JSON files in the filesystem, dependencies are minimal, and there is no JavaScript.

I first heard of Snac from Stefano Marinelli, who is a lovely source of news from the BSD world, selfhosting, networking and everything related to Unix philosophy. Then from Giacomo Tesio and this good post How to run your own social network (with Snac).

So, this is my linkblog on fediverse, made with Snac: https://href.literarymachin.es/raffaele.

This is how I have installed it. I prefer a containerized deploy, but a static binary build and a systemd service are enough and maybe even simpler to deploy.

I build the image on my laptop:

git clone https://codeberg.org/grunfink/snac2.git
cd snac2
docker build -t snac .

Then I transfer the image to the remote server (it's a 12MB image, I don't need a registry!):

docker image save snac | ssh {REMOTE_SERVER} docker load

I run it with this Docker compose:

services:
    href:
        image: snac
        restart: always
        security_opt:
            - no-new-privileges:true
        volumes:
            - ./data:/data
        ports:
            - "8001:8001"

mkdir data
docker compose up -d

The basic configuration needed is changing the hostname:

cat data/data/server.json | jq .host
"href.literarymachin.es"

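For anyone who would rather script that change than edit the file by hand, here is a minimal sketch. It assumes only what the jq output above shows, namely that server.json is a JSON object with a top-level `host` key. `set_snac_host` is a hypothetical helper, not part of Snac, and whether Snac needs a restart to pick up the change is not covered here.

```python
import json
from pathlib import Path


def set_snac_host(server_json, hostname):
    """Rewrite the top-level "host" value in a server.json file."""
    path = Path(server_json)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["host"] = hostname  # every other key is left untouched
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config["host"]
```

With the compose layout above, that would be called as `set_snac_host("data/data/server.json", "href.literarymachin.es")`.
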
Then I create my user:

docker compose exec href snac adduser /data/data raffaele

And finally, I have configured an nginx proxy like this example.

Follow my linkblog, and suggest more linkblogs to follow!

December 23, 2024 12:00 AM

December 22, 2024

John Mark Ockerbloom

Let me live ‘neath your spell

The first musical Cole Porter and Herbert Fields wrote together was 1929’s Fifty Million Frenchmen, whose title alludes to a 1927 song not actually used in the show. The most memorable song it does use is Porter’s “You Do Something to Me”, in which the two main characters confess their mutual beguilement. Among the song’s many covers, I’m personally fond of Sinéad O’Connor’s, which my spouse and I danced to on our wedding day. Song and show go public domain in 10 days.

by John Mark Ockerbloom at December 22, 2024 10:27 PM

Nick Ruest

Exploring #timesup Twitter Data

Overview

This is a dataset that I hydrated in April 2023. It was created by Zachary Maiorana, Pablo Morales Henry, and Jennifer Weintraub, and is the “#metoo Digital Media Collection - Hashtag: timesup”, which is part of the Schlesinger Library #timesup Digital Media Collection. The original dataset contains 3,720,729 Tweet IDs, and I was able to hydrate 2,518,092 tweets, giving me a hydration rate of 67.68%. The hydrated dataset covers from October 15, 2017 through June 1, 2020.
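
For reference, the hydration rate is just the share of the original tweet IDs that could still be retrieved. A quick sketch of the arithmetic, using the two counts above (`hydration_rate` is a hypothetical helper name):

```python
def hydration_rate(hydrated, total):
    """Percentage of the original tweet IDs that were successfully hydrated."""
    if total <= 0:
        raise ValueError("total must be a positive number of tweet IDs")
    return round(hydrated / total * 100, 2)


TOTAL_IDS = 3_720_729  # tweet IDs in the original dataset
HYDRATED = 2_518_092   # tweets retrieved in April 2023

print(hydration_rate(HYDRATED, TOTAL_IDS))  # 67.68
print(TOTAL_IDS - HYDRATED)                 # 1202637 IDs that did not hydrate
```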

#timesup Tweet Volume
#timesup wordcloud

Top languages

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)

df.language
  .groupBy("lang")
  .count()
  .orderBy(col("count").desc)
  .show(10)
+----+-------+                                                                  
|lang|  count|
+----+-------+
|  en|2209208|
| qme| 112478|
|  es|  51085|
| und|  23017|
|  ja|  22439|
| qht|  15806|
|  pt|  15539|
|  fr|  14892|
|  ko|   8150|
|  th|   7134|
+----+-------+

Top tweeters

Using timesup-user-info.csv from df.userInfo.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("timesup-user-info"), and pandas:

#timesup Top Tweeters

Retweets

Using the full line-oriented JSON dataset and twut:

import io.archivesunleashed._

val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)

df.mostRetweeted.show(10, false)
+-------------------+-----------------+                                         
|tweet_id           |max_retweet_count|
+-------------------+-----------------+
|950562797183799296 |26883            |
|971801461024686080 |21628            |
|950137819808305152 |19291            |
|950215871200456705 |17209            |
|956944474596196352 |12927            |
|958067014597300224 |12648            |
|950211856630628353 |11741            |
|950140126591504384 |11672            |
|1034341408361078784|10176            |
|950564161913868293 |10161            |
+-------------------+-----------------+    

From there, we can append the tweet ID to https://twitter.com/i/status/ to see the tweet. Here are the top three:

  1. 26,883

  2. 21,628

  3. 19,291
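Building those links is plain string concatenation. A small Python sketch, using the three most-retweeted IDs from the table above (the helper name tweet_url is mine):

```python
STATUS_BASE = "https://twitter.com/i/status/"


def tweet_url(tweet_id):
    """Append a tweet ID to the status base URL from the post."""
    return f"{STATUS_BASE}{tweet_id}"


# Top three tweet IDs from the mostRetweeted output, kept as strings
# so no tool along the way rounds them as floating-point numbers.
top_three = [
    "950562797183799296",
    "971801461024686080",
    "950137819808305152",
]

for tweet_id in top_three:
    print(tweet_url(tweet_id))
```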

Top Hashtags

Using timesup-hashtags.csv from df.hashtags.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("timesup-hashtags"), and pandas:

#timesup hashtags

Top URLs

Using the full line-oriented JSON dataset and twut:

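import io.archivesunleashed._
// Harmless in spark-shell; required if this runs as a compiled job.
import org.apache.spark.sql.functions.col

// Load the hydrated, line-oriented JSON tweets into a DataFrame.
val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)

// Count how often each URL appears and show the ten most shared.
df.urls
  .groupBy("url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)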
+----------------------------------------------------------+-----+
|url                                                       |count|
+----------------------------------------------------------+-----+
|https://twitter.com/ivankatrump/status/950561402053447685 |46826|
|https://t.co/AyUNRZICjI                                   |20294|
|https://twitter.com/jtimberlake/status/950136611391471616 |13988|
|https://t.co/5jAElrqeFz                                   |12572|
|https://twitter.com/IvankaTrump/status/950561402053447685 |8287 |
|https://t.co/si9E5MPw8I                                   |7801 |
|http://www.karlyletomms.com/single-post/2018/01/07/MeToo  |5459 |
|https://t.co/OmJCYUkD6W                                   |5298 |
|https://twitter.com/GeeksOfColor/status/950165381607391232|4534 |
|https://t.co/3TMsQFmboS                                   |4455 |
+----------------------------------------------------------+-----+
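Two rows in the table above point to the same tweet and differ only in the capitalization of the account name (ivankatrump vs. IvankaTrump). Twitter screen names are case-insensitive, so one could fold such rows together before ranking. A hedged Python sketch follows; the normalization rule is my own assumption, not something twut does, and t.co short links are deliberately left alone because their paths are case-sensitive.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit

# Hosts whose paths are safe to lowercase (an assumption for this sketch).
TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com"}


def normalize(url):
    """Lowercase host and path for twitter.com URLs; leave others untouched."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in TWITTER_HOSTS:
        return urlunsplit(
            (parts.scheme.lower(), host, parts.path.lower(), parts.query, parts.fragment)
        )
    return url


# Three rows copied from the Top URLs table above.
counts = {
    "https://twitter.com/ivankatrump/status/950561402053447685": 46826,
    "https://t.co/AyUNRZICjI": 20294,
    "https://twitter.com/IvankaTrump/status/950561402053447685": 8287,
}

# Sum the counts of URLs that normalize to the same string.
merged = Counter()
for url, count in counts.items():
    merged[normalize(url)] += count

for url, count in merged.most_common():
    print(count, url)
```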

Top media urls

Using the full line-oriented JSON dataset and twut:

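import io.archivesunleashed._
// Harmless in spark-shell; required if this runs as a compiled job.
import org.apache.spark.sql.functions.col

// Load the hydrated, line-oriented JSON tweets into a DataFrame.
val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)

// Drop rows without media, then count each media URL and show the top ten.
df.mediaUrls
  .filter(col("media_url").isNotNull)
  .groupBy("media_url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)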
+------------------------------------------------------------------------------------------+-----+
|media_url                                                                                 |count|
+------------------------------------------------------------------------------------------+-----+
|https://pbs.twimg.com/media/DS-RFRlVwAAvCJr.jpg                                           |8463 |
|https://pbs.twimg.com/media/DTDwIDtVQAEPq0t.jpg                                           |6749 |
|https://pbs.twimg.com/media/DTDgvL6VMAEJz-Y.jpg                                           |3793 |
|https://video.twimg.com/amplify_video/950227596121288705/vid/640x360/60Eys_8CYjy8RiIh.mp4 |3327 |
|https://video.twimg.com/amplify_video/950227596121288705/pl/mTRKmGFdrLZ1g2l4.m3u8         |3327 |
|https://video.twimg.com/amplify_video/950227596121288705/vid/1280x720/OK_RCHIfzkFpFwHj.mp4|3327 |
|https://video.twimg.com/amplify_video/950227596121288705/vid/320x180/uVgPuVP2vwHFRYkd.mp4 |3327 |
|https://pbs.twimg.com/media/DS-QNbPU0AAS5OT.jpg                                           |3310 |
|https://pbs.twimg.com/media/DS-JKK3UQAEOY0X.jpg                                           |2700 |
|https://pbs.twimg.com/media/DS5Lz7tVoAEjmAX.jpg                                           |2322 |
+------------------------------------------------------------------------------------------+-----+

A couple of years ago I created a juxta (collage) of the images from this dataset. It features 298,158 images, and you can check it out here.

Emotion score

Using timesup-text.csv from df.text.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("timesup-text"), Polars, and j-hartmann/emotion-english-distilroberta-base:

Emotion Distribution in #timesup Tweets
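A rough Python sketch of how such a distribution could be produced. The post names Polars and the j-hartmann/emotion-english-distilroberta-base model but does not show its code, so everything below is an assumption: the classify helper relies on the Hugging Face transformers pipeline (imported lazily, since it needs the library and a model download), and the tallying uses the standard library rather than Polars. The sample labels are invented.

```python
from collections import Counter

MODEL = "j-hartmann/emotion-english-distilroberta-base"


def classify(texts):
    """Return the top emotion label for each text.

    Requires the transformers library and downloads the model on first use,
    so it is defined here but not called in this sketch.
    """
    from transformers import pipeline

    clf = pipeline("text-classification", model=MODEL)
    return [result["label"] for result in clf(texts, truncation=True)]


def emotion_distribution(labels):
    """Return each label's share of the total, in percent, most common first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.most_common()}


# Invented labels standing in for classify(tweet_texts).
sample_labels = ["anger", "joy", "anger", "neutral", "anger", "joy", "surprise", "neutral"]
print(emotion_distribution(sample_labels))
```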

December 22, 2024 05:00 AM