I work at a non-profit academic institution, on a site that manages, searches, and displays digitized historical materials: The Science History Institute Digital Collections.
Much of our stuff is public domain, and regardless we put this stuff on the web to be seen and used and shared. (Within the limits of copyright law and fair use; we are not the copyright holders of most of it). We have no general problem with people scraping our pages.
The problem is that, like many sites, ours is being overwhelmed by poorly behaved bots. Lately one of the biggest problems is bots clicking on every possible combination of facet limits in our “faceted search”; this is not useful for them, and it overwhelms our site. “Search” pages are one of the most resource-constrained categories of page in our present site, adding to the injury. Peers say that even if we scaled up (auto or not), the bots sometimes scale up to match anyway!
One option would be putting some kind of “Web Application Firewall” (WAF) in front of the whole app. Our particular combination of team and budget and platform (heroku) makes a lot of these options expensive for us in licensing, staff time to manage, or both. Another option is certainly putting the whole thing behind the (ostensibly free) Cloudflare CDN and using its built-in WAF, but we’d like to avoid giving our DNS over to Cloudflare, I’ve heard mixed reviews of Cloudflare free staying free, and generally I am trying to avoid contributing to Cloudflare’s unaccountable, monopoly-like control of the internet.
Ironically, though, the solution we arrived at still uses Cloudflare: specifically Cloudflare’s Turnstile “captcha replacement”, one of those things that gives you the “check this box” or, more often, entirely non-interactive “checking if you are a bot” UXs.
[If you’re a tl;dr, just-look-at-the-code type, here’s the initial implementation PR in our open repo; there have been some bug fixes since then.]
While this still might unfortunately lock people using unconventional browsers etc out (just the latest of many complaints on HackerNews), we can use this to only protect our search pages. Most of our traffic comes directly from Google to an individual item detail page, which we can now leave completely out of it. We have complete control of allow-listing traffic based on whatever characteristics, when to present the challenge, etc. And it turns out we had a peer at another institution who had taken this approach and found it successful, so that was encouraging.
While typical documented Turnstile usage involves protecting form submissions, we actually want to protect certain urls, even when accessed via GET. Would this actually work well? What’s the best way to implement it?
Fortunately, when asking around on a chat for my professional community of librarian and archivist software hackers, Joe Corall from Lehigh University said they had done the exact same thing (even in response to the same problem, bots combinatorially exploring every possible facet value), and had super usefully written it up, and it had been working well for them.
Joe’s article and the flowchart it contains are worth looking at. His implementation is a Drupal plugin (used in at least several Islandora instances); the VuFind library discovery layer recently implemented a similar approach. We have a Rails app, so we needed to implement it ourselves, but with Joe paving the way (and patiently answering our questions, so we could start with the parameters that worked for him), it was pretty quick work, buoyed by the confidence that this approach wasn’t just an untested experiment, but had worked for a similar peer.
Joe allow-listed certain client domain names based on reverse IP lookup, but I’ve started without that, not wanting the performance hit on every request if I can avoid it. Joe also allow-listed their “on campus” IPs, but we are not a university and only have a few staff “on campus” and I always prefer to show the staff the same thing our users are seeing — if it’s inconvenient and intolerable, we want to feel the pain so we fix it, instead of never even seeing the pain and not knowing our users are getting it!
I’m going to explain and link to how we implemented this in a Rails app, and our choices of parameters for the various parameterized things. I’ll also tell you we’ve written this in a way that paves the way to extracting it to a gem (everything is kept consolidated in a small number of files and very parameterized), so if there’s interest let me know. (Code4Lib-ers, our slack is a great place to get in touch, I’m jrochkind).
Here’s the implementing PR. It is written in such a way as to keep the code consolidated for future gem extraction, all in the BotDetectController class, which means, kind of weirdly, there is some code to inject in class methods in the controller. While it does Turnstile now, it’s written with variable/class names such that analogous products could be made available.
We were already using rack-attack to rate-limit. We added a “track” monitor with our code to decide when a client has passed a rate-limit gate and requires a challenge. We start by allowing 10 requests per 12 hours (Joe at Lehigh did 20 per 24 hours), batched together in subnets. (Joe did subnets too, but we use the smaller /24 (i.e. x.y.z.*) for IPv4 instead of Joe’s larger /16 (x.y.*.*).)
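To make the subnet batching concrete, here is a minimal sketch using only Ruby's standard `ipaddr` library. The helper name `subnet_bucket` is hypothetical and is not taken from our PR; the /24 IPv4 prefix matches the description above, while the /64 IPv6 prefix is purely an assumption for illustration.

```ruby
require "ipaddr"

# Hypothetical helper: collapse a client IP into its containing subnet, so
# that every address in the subnet shares one rate-limit bucket.
# /24 for IPv4 follows the post; /64 for IPv6 is an assumption.
def subnet_bucket(ip_string, ipv4_prefix: 24, ipv6_prefix: 64)
  ip = IPAddr.new(ip_string)
  prefix = ip.ipv4? ? ipv4_prefix : ipv6_prefix
  ip.mask(prefix).to_s
end

subnet_bucket("203.0.113.57")                   # => "203.0.113.0"
subnet_bucket("203.0.113.57", ipv4_prefix: 16)  # => "203.0.0.0" (the larger /16)
```

A string like this could serve as the value returned from a rate-limit discriminator block, so that all clients in one subnet count against the same limit.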
Note that rack-attack does not use sliding/rolling windows for rate limits, but fixed windows that reset after each window period. This makes a difference especially when you use as long a period as we do, but it’s not a problem with our very low count per period, and it does keep RAM use extremely efficient (just an integer count per rate-limit bucket).
When the rate limit is reached, the rack-attack block just sets a key/value in the rack env to tell another component that a challenge is required. (Setting it in the session may have worked, but we want to be absolutely sure this will work even if the client is not storing cookies, and this is really only meant as this-request state, so the rack env seemed a good way to set state in rack-attack that could be seen in a Rails controller.)
There’s a Rails before_action filter that we just put on the application-wide ApplicationController. It looks for the “bot challenge required” key in the rack env; if it is present, and there isn’t anything in the session saying the client has already passed a bot challenge, then we redirect to a “challenge” page that will display/activate Turnstile.
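The fixed-window behavior can be illustrated with a small in-memory sketch. This is not rack-attack's code, just the same general idea under simplifying assumptions: the function name and the plain Hash used as a store are hypothetical, and old window keys are never expired here.

```ruby
# Minimal fixed-window counter sketch (NOT rack-attack's implementation).
# `counts` is any Hash-like store; one integer is kept per window + bucket.
def fixed_window_exceeded?(counts, bucket, limit:, period:, now: Time.now)
  # Every timestamp inside the same period maps to the same window number,
  # so the count implicitly resets when the window number changes.
  window = now.to_i / period
  key = "#{window}:#{bucket}"
  counts[key] = counts.fetch(key, 0) + 1
  counts[key] > limit
end

counts = {}
twelve_hours = 12 * 60 * 60
start = Time.at(0)

# The first 10 requests in a window are under the limit; the 11th exceeds it.
11.times.map { fixed_window_exceeded?(counts, "203.0.113.0", limit: 10, period: twelve_hours, now: start) }.last
# => true

# A request in the next window starts again from a count of 1.
fixed_window_exceeded?(counts, "203.0.113.0", limit: 10, period: twelve_hours, now: start + twelve_hours)
# => false
```

One consequence of fixed windows is that a client can make up to the limit just before a window ends and up to the limit again just after, which is why the distinction matters more with long periods.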
We simply put the original/destination URL in a query param on that page. (And include logic to refuse to redirect to anything but a relative path on same host, to avoid any nefarious uses).
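As a rough illustration of the "relative path on same host only" check, here is a sketch using Ruby's standard `uri` library. The helper name `safe_return_path` and its fallback of `/` are hypothetical; the actual logic in our PR may differ.

```ruby
require "uri"

# Hypothetical helper: accept the destination only if it is a plain relative
# path on this host; anything else falls back to a safe default.
def safe_return_path(dest, fallback: "/")
  return fallback unless dest.is_a?(String)
  # Must be an absolute path, not a scheme ("https:...", "javascript:...").
  return fallback unless dest.start_with?("/")
  # "//host/path" is protocol-relative and would leave the site; backslashes
  # are rejected because some browsers treat them like forward slashes.
  return fallback if dest.start_with?("//") || dest.include?("\\")

  uri = URI.parse(dest)
  return fallback if uri.scheme || uri.host

  dest
rescue URI::InvalidURIError
  fallback
end

safe_return_path("/catalog?q=chemistry")    # => "/catalog?q=chemistry"
safe_return_path("https://evil.example/")   # => "/"
safe_return_path("//evil.example/catalog")  # => "/"
```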
One action in our BotDetectController just displays the Turnstile challenge. The Cloudflare Turnstile callback gives us a token that we need to verify server-side with the Turnstile API, to confirm the challenge was really passed.
The front-end makes a JS fetch/XHR request to the second action in our BotDetectController. That back-end verify action makes the API call to Turnstile and, if the challenge passed, sets a value in the Rails session (encrypted and signed, so secure) with the time of the pass, so the before_action guard can give the user access.
If the front-end JS gets a go-ahead from the back-end, it uses JS location.replace to go to the original destination. This conveniently removes the challenge page from the user’s browser history, as if it never happened, with the browser back button still working great.
In most cases the challenge page, if non-interactive, won’t be displayed for more than a few seconds. (The language has been tweaked since these screenshots.)
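The server-side verification step might look roughly like the sketch below. The endpoint URL and the `secret`, `response`, `remoteip` and `success` fields come from Cloudflare's documented siteverify API; the method name `turnstile_passed?` and the injectable `poster` parameter are hypothetical and are only there so the logic can be exercised without a network call. It is not the code from our PR.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical helper: ask Cloudflare whether a Turnstile token is valid.
# `poster` is any callable taking (uri, params) and returning a response
# body String; by default it performs a real form-encoded POST.
def turnstile_passed?(token, secret:, remote_ip: nil,
                      poster: ->(uri, params) { Net::HTTP.post_form(uri, params).body })
  return false if token.nil? || token.empty?

  uri = URI("https://challenges.cloudflare.com/turnstile/v0/siteverify")
  params = { "secret" => secret, "response" => token }
  params["remoteip"] = remote_ip if remote_ip

  JSON.parse(poster.call(uri, params))["success"] == true
rescue JSON::ParserError, SocketError, SystemCallError, Timeout::Error
  # Treat an unreachable or garbled verification response as "not passed".
  false
end
```

Failing closed on errors is a choice made in this sketch; a site that would rather let users through when Cloudflare is unreachable could return true from the rescue instead.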
We currently have a ‘pass’ good for 24 hours — once you pass a turnstile challenge, if your cookies/session are intact, you won’t be given another one for 24 hours no matter how much traffic. All of this is easily configurable.
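The pass check itself can be sketched as a small pure function. The session key `bot_challenge_passed_at`, the ISO 8601 timestamp format and the method name are all assumptions for illustration; only the 24-hour validity comes from the description above.

```ruby
require "time"

# Hypothetical helper: has this session passed a challenge recently enough?
# `session` is any Hash-like object; the key name is an assumption.
def bot_challenge_passed?(session, now: Time.now, valid_for: 24 * 60 * 60,
                          key: "bot_challenge_passed_at")
  stamp = session[key]
  return false unless stamp.is_a?(String)

  age = now - Time.iso8601(stamp)
  # Reject timestamps from the future as well as expired ones.
  age >= 0 && age < valid_for
rescue ArgumentError
  # Unparseable timestamp: treat as no pass.
  false
end

now = Time.utc(2025, 1, 2, 12, 0, 0)
bot_challenge_passed?({ "bot_challenge_passed_at" => "2025-01-02T00:00:00Z" }, now: now) # => true
bot_challenge_passed?({ "bot_challenge_passed_at" => "2025-01-01T11:00:00Z" }, now: now) # => false
```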
If the challenge DOES fail for some reason, the user may be looking at the Challenge page with one of two kinds of failures, and some additional explanatory text and contact info.
This particular flow only works for GET requests. It could be expanded to work for POST requests (with an invisible JS created/submitted form?), but our initial use case didn’t require it, so for now the filter just logs a warning and fails for POST.
This flow also isn’t going to work for fetch/ajax requests, it’s set up for ordinary navigation, since it redirects to a challenge then redirects back. Our use case is only protecting our search pages — but the blacklight search in our app has a JS fetch for “facet more” behavior. Couldn’t figure out a good/easy way to make this work, so for now we added an exemption config, and just exempt requests to the #facet action that look like they’re coming from fetch. Not bothered that an “attacker” could escape our bot detection for this one action; our main use case is stopping crawlers crawling indiscriminately, and I don’t think it’ll be a problem.
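One way to guess that a request "looks like it's coming from fetch" is to inspect request headers in the rack env, as in the sketch below. Which headers to check is an assumption here, not a statement of what our exemption config does: browsers send `Sec-Fetch-Dest: empty` for `fetch()` and XHR calls (and `document` for ordinary navigation), and some JS libraries add `X-Requested-With: XMLHttpRequest`. Both are trivially spoofable, which is acceptable for the use case described above.

```ruby
# Hypothetical helper: does this rack env look like a JS fetch/XHR request
# rather than ordinary page navigation? Header choice is an assumption.
def looks_like_fetch?(env)
  env["HTTP_SEC_FETCH_DEST"] == "empty" ||
    env["HTTP_X_REQUESTED_WITH"].to_s.casecmp?("XMLHttpRequest")
end

looks_like_fetch?("HTTP_SEC_FETCH_DEST" => "empty")     # => true
looks_like_fetch?("HTTP_SEC_FETCH_DEST" => "document")  # => false
```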
To get through the bot challenge requires a user-agent to have both JS and cookies enabled. JS may have been required before anyway (not sure), but cookies were not. Oh well. Only search pages are protected by the bot challenge.
The Lehigh implementation does a reverse-lookup of the client IP, and allow-lists clients from IPs that reverse lookup to desirable and well-behaved bots. We don’t do that, in part because I didn’t want the performance hit of the reverse-lookup. We have a Sitemap, and in general, I’m not sure we need bots crawling our search results pages at all… although I’m realizing as I write this that our “Collection” landing pages are included (as they show search results)… may want to exempt them, we’ll see how it goes.
We don’t have any client-based allow-listing… but would consider just exempting any client that has a user-agent admitting it’s a bot, all our problematic behavior has been from clients with user-agents appearing to be regular browsers (but obviously automated ones, if they are being honest).
We could possibly only enable the bot challenge when the site appears “under load”, whether that’s a certain number of overall requests per second, a certain machine load (but any auto-scaling can make that an issue), or size of heroku queue (possibly same).
We could use more sophisticated fingerprinting for rate limit buckets. Instead of IP-address-based, colleague David Cliff from Northeastern University has had success using HTTP user-agent, accept-encoding, and accept-language to fingerprint actors across distributed IPs, writing:
I know several others have had bot waves that have very deep IP address pools, and who fake their user agents, making it hard to ban.
We had been throttling based on the most common denominator (url pattern), but we were looking for something more effective that gave us more resource headroom.
On inspecting the requests in contrast to healthy user traffic we noticed that there were unifying patterns we could use, in the headers.
We made a fingerprint based on them, and after blocking based on that, I haven’t had to do a manual intervention since.
require "base64"

# Combine several request headers into one opaque fingerprint string.
def fingerprint
  result = "#{env["HTTP_ACCEPT"]} | #{env["HTTP_ACCEPT_ENCODING"]} | #{env["HTTP_ACCEPT_LANGUAGE"]} | #{env["HTTP_COOKIE"]}"
  Base64.strict_encode64(result)
end
…the common rule we arrived at mixed positive/negative discrimination using the above
request.env["HTTP_ACCEPT"].blank? && request.env["HTTP_ACCEPT_LANGUAGE"].blank? && request.env["HTTP_COOKIE"].blank? && (request.user_agent.blank? || !request.user_agent.downcase.include?("bot".downcase))
so only a bot that left the fields blank and lied with a non-bot user agent would be affected
We could also base rate limit or “discriminators” for rate limit buckets on info we can look up from the client IP address, either a DNS or network lookup (performance worries), or perhaps a local lookup using the free MaxMind databases that also include geocoding and some organizational info.
Too early to say, we just deployed it!
I sometimes get annoyed when people blog like this, but being the writer, I realized that if I wait a month to see how well it’s working to blog — I’ll never blog! I have to write while it’s fresh and still interesting to me.
But I’m encouraged that colleagues say very similar approaches have worked for them. Thanks again to Joe Corall for paving the way with a Drupal implementation, blogging it, discussing it on chat, and answering questions! And all the other librarian and cultural heritage technologists sharing knowledge and collaboration on this and many other topics!
I can say that already it is being triggered a lot, by bots that don’t seem to get past it. This includes google bot and Meta-ExternalAgent (which I guess is AI-related; we have no particular use-based objections we are trying to enforce here, just trying to preserve our resources). While Google also has no reason to combinatorially explore every facet combination (and has a sitemap), I’m not sure if I should exempt known resource-considerate bots from the challenge (and whether to do so by trusting user-agent or not; our actual problems have all been with ordinary-browser-appearing user-agents).
In the next few days we’re releasing an update to Talpa Search—a major jump in Talpa’s ability to find books and other media within library catalogs.
Today we’ve released a set of 200 “benchmark,” or test searches. Together with hundreds more, these are the searches we use to assess Talpa’s quality, test particular tweaks and features, and track Talpa’s improvement over time.
This set is named “What’s that book?” It consists of 200 searches, each of which has a single best answer. For example, the best answer to the search “prince harry memoir” is Spare. They were created by LibraryThing staff, generally about their own books, or books they know well.
By “best answer” we don’t mean the questions are all easy, or even clear. Some examples:
The searches cover different types of searches:
Talpa handles some other broad types, such as date-restricted titles (“1980s teen films”), author searches (“Lisa Carey”) and subjects (“persian art”), but these don’t have a single best answer, so they’re not included in this set.
We’ve included typos and spelling mistakes, because a good system should be able to handle these:
We’ve tried to include different ways a patron might word their search:
To mirror patrons’ interest, half the set are for new books, published 2023–2024. The other half are for books published before 2023. Although Talpa Search can handle movies and music, only book searches are included in this set.
That’s it!
For fun, we’ve hidden the answers so you can test yourself. If you want all the answers unhidden, click show all answers. The searches are also available as a text file, here.
If you find any problems, let us know!
Q: children’s book boy baked into a cake
A: In the Night Kitchen by Maurice Sendak
Score: 100 – Position: 1
Q: retired female assassins
A: Killers of a Certain Age by Deanna Raybourn
Score: 100 – Position: 1
Q: internet social history washington post reporter
A: Extremely Online: The Untold Story of Fame, Influence, and Power on the Internet by Taylor Lorenz
Score: 100 – Position: 1
Q: popular serial killer novel set in missouri
A: All the Colors of the Dark by Chris Whitaker
Score: 0 – Not in results list.
Q: maine kids book with out of control donut machine
A: Homer Price by Robert McCloskey
Score: 80 – Position: 2
Q: seventh Cormoran Strike book
A: The Running Grave by Robert Galbraith
Score: 100 – Position: 1
Q: children’s book about dutch resistance and windmills
A: The Winged Watchman by Hilda Van Stockum
Score: 100 – Position: 1
Q: new gamache mystery
A: The Grey Wolf by Louise Penny
Score: 100 – Position: 1
Q: World war 1 historical fantasy taking place in flanders
A: The Warm Hands of Ghosts by Katherine Arden
Score: 30 – Position: 10
Q: artemis fowl graphic novel after eternity code
A: The Opal Deception: The Graphic Novel by Eoin Colfer
Score: 80 – Position: 2
Q: orc buys coffee shop
A: Legends & Lattes by Travis Baldree
Score: 100 – Position: 1
Q: flat stanley in france
A: Framed in France by Josh Greenhut
Score: 100 – Position: 1
Q: new reagan bio
A: Reagan: His Life and Legend by Max Boot
Score: 70 – Position: 3
Q: historical fiction set in irish hospital during the great flu
A: The Pull of the Stars by Emma Donoghue
Score: 100 – Position: 1
Q: sequel to the sparrow
A: Children of God by Mary Doria Russell
Score: 100 – Position: 1
Q: ya mystery, dark academia, black author, missing roommate
A: Where Sleeping Girls Lie by Faridah Àbíké-Íyímídé
Score: 30 – Position: 7
Q: kids book where a boys best friend drowns
A: Bridge to Terabithia by Katherine Paterson
Score: 80 – Position: 2
Q: Historical fiction about black bookbinder in victorian era
A: The Library Thief by Kuchenga Shenjé
Score: 60 – Position: 4
Q: Fiction, banned books in a southern town, little free library
A: Lula Dean’s Little Library of Banned Books by Kirsten Miller
Score: 100 – Position: 1
Q: History of time between Lincoln’s election and the Civil War by popular author
A: The Demon of Unrest: A Saga of Hubris, Heartbreak, and Heroism at the Dawn of the Civil War by Erik Larson
Score: 0 – Not in results list.
Q: recent hot memoir of tech journalist
A: Burn Book: A Tech Love Story by Kara Swisher
Score: 30 – Position: 11
Q: patrick stewart bio
A: Making It So: A Memoir by Patrick Stewart
Score: 100 – Position: 1
Q: novel where four strangers trap two gay dads and their daughter in a summer cabin
A: The Cabin at the End of the World by Paul G. Tremblay
Score: 100 – Position: 1
Q: that book about hunting a whale
A: Moby Dick by Herman Melville
Score: 100 – Position: 1
Q: Campus novel about undocumented student at harvard university
A: Catalina by Karla Cornejo Villavicencio
Score: 0 – Not in results list.
Q: academic book about “paradigm shifts” in science
A: The Structure of Scientific Revolutions by Thomas S. Kuhn
Score: 100 – Position: 1
Q: throwing muses memoir
A: Rat Girl: A Memoir by Kristin Hersh
Score: 100 – Position: 1
Q: picture book with no words about a snowman
A: The Snowman by Raymond Briggs
Score: 100 – Position: 1
Q: woman manages hotel on Nantucket that has a ghost
A: The Hotel Nantucket by Elin Hilderbrand
Score: 100 – Position: 1
Q: Recent memoir of author who was stabbed
A: Knife: Meditations After an Attempted Murder by Salman Rushdie
Score: 100 – Position: 1
Q: children’s book about boy throwing things into a tree
A: Stuck by Oliver Jeffers
Score: 70 – Position: 3
Q: steampunk in victorian london with Japanese watchmaker
A: The Watchmaker of Filigree Street by Natasha Pulley
Score: 100 – Position: 1
Q: Marvel universe novel about retired superhero turned private investigator
A: Breaking the Dark by Lisa Jewell
Score: 100 – Position: 4
Q: Fantasy fiction about books with different powers/magic
A: The Book of Doors by Gareth Brown
Score: 0 – Not in results list.
Q: book about gay friends and AIDs in 1980s Chicago
A: The Great Believers by Rebecca Makkai
Score: 100 – Position: 1
Q: science fiction novel where Greek and Chinese science are both true
A: Celestial Matters by Richard Garfinkle
Score: 0 – Not in results list.
Q: vanderbeekers
A: The Vanderbeekers of 141st Street by Karina Yan Glaser
Score: 100 – Position: 1
Q: Historical fantasy about servant with magical powers, set during Spanish Golden Age
A: The Familiar by Leigh Bardugo
Score: 0 – Not in results list.
Q: second roselynde chronicles
A: Alinor by Roberta Gellis
Score: 80 – Position: 2
Q: Fiction romance about a stand up comedian trying to figure out his breakup
A: Good Material by Dolly Alderton
Score: 100 – Position: 1
Q: french elephant lives in the city
A: The Story of Babar: The Little Elephant by Jean de Brunhoff
Score: 80 – Position: 2
Q: barefoot contessa memoir
A: Be Ready When the Luck Happens: A Memoir by Ina Garten
Score: 100 – Position: 1
Q: series with 3 orphan kids first book
A: The Bad Beginning by Lemony Snicket
Score: 100 – Position: 1
Q: estranged, grieving brothers in ireland
A: Intermezzo by Sally Rooney
Score: 30 – Position: 15
Q: obituaries from an AI
A: Remember You Will Die: A Novel by Eden Robins
Score: 80 – Position: 2
Q: third sylvia day crossfire book
A: Entwined With You by Sylvia Day
Score: 100 – Position: 1
Q: New sci fi about robot girlfriend
A: Annie Bot by Sierra Greer
Score: 100 – Position: 1
Q: men coming out of the attic
A: The Husbands by Holly Gramazio
Score: 80 – Position: 2
Q: late show host cookbook
A: Does This Taste Funny? Recipes Our Family Loves by Stephen Colbert
Score: 0 – Not in results list.
Q: Literary fiction about queer iranian immigrant guy
A: Martyr! by Kaveh Akbar
Score: 50 – Position: 5
Q: picture book about ducks in boston
A: Make Way for Ducklings by Robert McCloskey
Score: 100 – Position: 1
Q: book about AI from One Useful Thing guy
A: Co-Intelligence: Living and Working with AI by Ethan Mollick
Score: 80 – Position: 2
Q: Fiction about teen girl boxers in Reno, nevada
A: Headshot by Rita Bullwinkel
Score: 0 – Not in results list.
Q: Exploration of plants’ intelligence, nonfiction
A: The Light Eaters: How the Unseen World of Plant Intelligence Offers a New Understanding of Life on Earth by Zoë Schlanger
Score: 100 – Position: 1
Q: Mystery novel about girl filming documentary about her missing mother
A: The Reappearance of Rachel Price by Holly Jackson
Score: 30 – Position: 14
Q: memoir of prince harry
A: Spare by Prince Harry, Duke of Sussex
Score: 100 – Position: 1
Q: Non-fiction book about widow discovering husband’s secret life
A: The Widow’s Guide to Dead Bastards by Jessica Waite
Score: 60 – Position: 4
Q: man turns into a cochroach
A: The Metamorphosis by Franz Kafka
Score: 100 – Position: 1
Q: tintin book after Explorers on the Moon
A: The Calculus Affair by Hergé
Score: 100 – Position: 1
Q: amis book where soviets control britain
A: Russian Hide and Seek by Kingsley Amis
Score: 100 – Position: 1
Q: maine kids book where hero has a pet skunk
A: Homer Price by Robert McCloskey
Score: 0 – Not in results list.
Q: Teenager disappears from Adirondack summer camp
A: The God of the Woods by Liz Moore
Score: 100 – Position: 1
Q: most recent starlight’s shadow book
A: Capture the Sun by Jessie Mihalik
Score: 50 – Position: 5
Q: Fantasy about fox spirit girl in manchuria
A: The Fox Wife by Yangsze Choo
Score: 100 – Position: 1
Q: Book where a girl has to walk or else she will die
A: A Short Walk Through a Wide World by Douglas Westerbeke
Score: 0 – Not in results list.
Q: all art is alive comic
A: Doodleville by Chad Sell
Score: 0 – Not in results list.
Q: popular history of period after bronze age collapse
A: After 1177 B.C. : The Survival of Civilizations by Eric H. Cline
Score: 80 – Position: 2
Q: nonfiction crack cocaine era
A: When Crack Was King: A People’s History of a Misunderstood Era by Donovan X. Ramsey
Score: 100 – Position: 1
Q: memoir by paul fussell’s son
A: Muscle: Confessions of an Unlikely Bodybuilder by Samuel W. Fussell
Score: 0 – Not in results list.
Q: book that is an allegory of genesis story in california
A: East of Eden by John Steinbeck
Score: 100 – Position: 1
Q: woman in office accidentally gains access to coworkers’ emails
A: I Hope This Finds You Well by Natalie Sue
Score: 100 – Position: 1
Q: thriller with cat and el morgan
A: Mirrorland by Carole Johnstone
Score: 0 – Not in results list.
Q: Croatian children’s book about a brave shoemaker’s apprentice
A: The Brave Adventures of Lapitch by Ivana Brlic-Mazuranic
Score: 100 – Position: 1
Q: murderbot book after network effect
A: Fugitive Telemetry by Martha Wells
Score: 100 – Position: 1
Q: Navalny book
A: Patriot: A Memoir by Alexei Navalny
Score: 100 – Position: 3
Q: that dystopian book with jonas and gabriel
A: The Giver by Lois Lowry
Score: 80 – Position: 2
Q: sequel to jurrasic park book
A: The Lost World by Michael Crichton
Score: 100 – Position: 1
Q: ancient greek dream manual
A: The Interpretation of Dreams by Artemidorus Daldianus
Score: 100 – Position: 1
Q: french girl makes deal with devil becomes immortal
A: The Invisible Life of Addie LaRue by V. E. Schwab
Score: 100 – Position: 1
Q: sequel to To Tame a Sheikh
A: To Tempt a Sheikh by Olivia Gates
Score: 100 – Position: 1
Q: Popular 2024 romance book about children’s librarian
A: Funny Story by Emily Henry
Score: 100 – Position: 1
Q: dolly parton and sister cookbook
A: Good Lookin’ Cookin’: A Year of Meals – A Lifetime of Family, Friends, and Food [A Cookbook] by Dolly Parton
Score: 100 – Position: 1
Q: book that became american fiction movie
A: Erasure by Percival Everett
Score: 0 – Not in results list.
Q: picture book about axolotls
A: Not A Monster by Claudia Guadalupe Martinez
Score: 100 – Position: 4
Q: childrens books about chihuahua with giant ears second book
A: Skippyjon Jones in the Doghouse by Judy Schachner
Score: 80 – Position: 2
Q: most recent kingsbridge book
A: The Armor of Light by Ken Follett
Score: 100 – Position: 1
Q: historical fiction boston journalist wwii
A: The Rumor Game by Thomas Mullen
Score: 0 – Not in results list.
Q: heinlein novel where kids are trapped on an alien planet
A: Tunnel in the Sky by Robert A. Heinlein
Score: 100 – Position: 1
Q: Memoir of a woman whos husband died while running a half marathon
A: Here After: A Memoir by Amy Lin
Score: 30 – Position: 9
Q: sci-fi novel with telepathically-linked dogs in a medieval world
A: A Fire upon the Deep by Vernor Vinge
Score: 80 – Position: 2
Q: romcom with bartender and librarian
A: Funny Story by Emily Henry
Score: 0 – Not in results list.
Q: persian epic about creation and the gods
A: Shahnameh: The Persian Book of Kings by Firdausi
Score: 100 – Position: 1
Q: New book on the history of hip hop
A: Hip-Hop Is History by Questlove
Score: 100 – Position: 1
Q: poems about taylor swift songs
A: Invisible Strings: 113 Poets Respond to the Songs of Taylor Swift by Kristie Frederick Daugherty
Score: 60 – Position: 4
Q: prince harry memoir
A: Spare by Prince Harry, Duke of Sussex
Score: 100 – Position: 1
Q: jesus wife papyrus hoax
A: Veritas: A Harvard Professor, a Con Man and the Gospel of Jesus’s Wife by Ariel Sabar
Score: 100 – Position: 1
Q: where’s my binkit
A: Dinosaur’s Binkit by Sandra Boynton
Score: 100 – Position: 1
Q: children’s book with grandmother and bowl of mush
A: Goodnight Moon by Margaret Wise Brown
Score: 100 – Position: 1
Q: bono memoir
A: Surrender: 40 Songs, One Story by Bono
Score: 100 – Position: 1
Q: ya comedic novel about beauty queens stranded on desert island
A: Beauty Queens by Libba Bray
Score: 100 – Position: 1
Q: Nonfiction book about the science of successful communicators
A: Supercommunicators: How to Unlock the Secret Language of Connection by Charles Duhigg
Score: 100 – Position: 1
Q: recent popular historical fiction about women in vietnam war
A: The Women by Kristin Hannah
Score: 100 – Position: 1
Q: sequel to 1177
A: After 1177 B.C. : The Survival of Civilizations by Eric H. Cline
Score: 100 – Position: 1
Q: children’s book about a girl who spies on her friends and takes notes
A: Harriet the Spy by Louise Fitzhugh
Score: 100 – Position: 1
Q: nurse in vietnam novel
A: The Women by Kristin Hannah
Score: 100 – Position: 1
Q: College friends reunite for week in Maine
A: Happy Place by Emily Henry
Score: 100 – Position: 1
Q: fourth magic treehouse
A: Pirates Past Noon by Mary Pope Osborne
Score: 100 – Position: 1
Q: woman and iranian family argue about a house novel
A: House of Sand and Fog by Andre Dubus III
Score: 100 – Position: 1
Q: Newest Tana French novel
A: The Hunter by Tana French
Score: 100 – Position: 1
Q: stan lee graphic novel bio
A: I Am Stan: A Graphic Biography of the Legendary Stan Lee by Tom Scioli
Score: 100 – Position: 1
Q: roald dahl book with kid in gypsy caravan
A: Danny the Champion of the World by Roald Dahl
Score: 100 – Position: 1
Q: book where all wheat and rice dies. it takes place in britain
A: The Death of Grass by John Christopher
Score: 100 – Position: 1
Q: ex-amazon employee memoir
A: Exit Interview: The Life and Death of My Ambitious Career by Kristi Coulter
Score: 100 – Position: 2
Q: The 22nd book in the Mitch Rapp series
A: Code Red by Vince Flynn
Score: 100 – Position: 1
Q: recent ireland dystopia novel
A: Prophet Song by Paul Lynch
Score: 100 – Position: 1
Q: rupaul new autobiography
A: The House of Hidden Meanings: A Memoir by RuPaul
Score: 100 – Position: 1
Q: 2024 fantasy novel slavic folklore baba yaga
A: When Among Crows by Veronica Roth
Score: 30 – Position: 17
Q: british murder mystery where a fictional version of the author is a character
A: The Word Is Murder by Anthony Horowitz
Score: 50 – Position: 5
Q: 2024 nonfiction about memory by memory researcher
A: Why We Remember: Unlocking Memory’s Power to Hold on to What Matters by Charan Ranganath
Score: 30 – Position: 9
Q: recent book by calvin and hobbes creator
A: The Mysteries by Bill Watterson
Score: 80 – Position: 2
Q: Funny book about single mom who creates popular OnlyFans account
A: Margo’s Got Money Troubles by Rufi Thorpe
Score: 30 – Position: 14
Q: contemporary novel about divorce in new york city, multiple narrators
A: Fleishman Is in Trouble by Taffy Brodesser-Akner
Score: 70 – Position: 3
Q: challenger book by chernobyl author
A: Challenger: A True Story of Heroism and Disaster on the Edge of Space by Adam Higginbotham
Score: 70 – Position: 3
Q: romance novel about tv dating show with a plus sized lead
A: One to Watch by Kate Stayman-London
Score: 100 – Position: 1
Q: Memoir of time in psych ward, reading books by fellow mad women
A: Committed: On Meaning and Madwomen by Suzanne Scanlon
Score: 60 – Position: 4
Q: collection with tower of babylon story
A: Stories of Your Life and Others by Ted Chiang
Score: 100 – Position: 1
Q: writing advice from steven king
A: On Writing: A Memoir of the Craft by Stephen King
Score: 100 – Position: 1
Q: crichton book with the nanotech
A: Prey by Michael Crichton
Score: 100 – Position: 1
Q: peter brown autobiography
A: Journeys of the Mind: A Life in History by Peter Brown
Score: 100 – Position: 1
Q: Mystery set on a reality tv love island dating show
A: One Perfect Couple by Ruth Ware
Score: 50 – Position: 5
Q: Retelling of Huck Finn from Jim’s point of view
A: James by Percival Everett
Score: 100 – Position: 1
Q: Sequel to YA fantasy about home for magical misfit children
A: Somewhere Beyond the Sea by TJ Klune
Score: 0 – Not in results list.
Q: novel Rushdie wrote after the knife attack
A: Victory City by Salman Rushdie
Score: 80 – Position: 2
Q: Romance about girl who loves pro golfer
A: Fangirl Down by Tessa Bailey
Score: 40 – Position: 6
Q: second book of Meier’s Marginal Jew
A: A Marginal Jew: Rethinking the Historical Jesus, Vol. 2 – Mentor, Message, and Miracles by John P. Meier
Score: 100 – Position: 1
Q: living boy meets grandfather in ghost world
A: Ghostopolis by Doug TenNapel
Score: 70 – Position: 3
Q: what is the third bond book?
A: Moonraker by Ian Fleming
Score: 100 – Position: 1
Q: gisele cookbook
A: Nourish: Simple Recipes to Empower Your Body and Feed Your Soul: A Healthy Lifestyle Cookbook by Gisele Bündchen
Score: 100 – Position: 2
Q: romance with Derek Pender
A: The Rule Book by Sarah Adams
Score: 100 – Position: 1
Q: Alphabetical essays on climate change
A: H Is for Hope: Climate Change from A to Z by Elizabeth Kolbert
Score: 30 – Position: 7
Q: Robot girlfriend discovers autonomy
A: Annie Bot by Sierra Greer
Score: 100 – Position: 1
Q: recent book about captain cook’s last voyage
A: The Wide Wide Sea: Imperial Ambition, First Contact and the Fateful Final Voyage of Captain James Cook by Hampton Sides
Score: 100 – Position: 1
Q: Historical fiction about the panama canal construction and people involved
A: The Great Divide by Cristina Henríquez
Score: 30 – Position: 7
Q: romance prince of england and son of the president
A: Red, White & Royal Blue by Casey McQuiston
Score: 100 – Position: 1
Q: Female friends at D.C. boarding house duringheight of McCarthyism
A: The Briar Club by Kate Quinn
Score: 0 – Not in results list.
Q: Magical realism, fiction, two boys who disappear for 6 months and can’t recall what happened
A: The Lost Story by Meg Shaffer
Score: 0 – Not in results list.
Q: literary novel about a life-long friendship of two video game developers
A: Tomorrow, and Tomorrow, and Tomorrow by Gabrielle Zevin
Score: 100 – Position: 1
Q: popular book on bronze-age collapse
A: 1177 B.C.: The Year Civilization Collapsed by Eric H. Cline
Score: 100 – Position: 1
Q: buckbeak
A: Harry Potter and the Prisoner of Azkaban by J. K. Rowling
Score: 100 – Position: 1
Q: recent book about REM
A: The Name of This Band Is R.E.M.: A Biography by Peter Ames Carlin
Score: 80 – Position: 2
Q: recent book on nuclear war
A: Nuclear War: A Scenario by Annie Jacobsen
Score: 100 – Position: 1
Q: recent cartoon demon perspective book
A: The Book of Bill (Gravity Falls) by Alex Hirsch
Score: 0 – Not in results list.
Q: romance where justin has a reddit curse
A: Just for the Summer by Abby Jimenez
Score: 0 – Not in results list.
Q: what is the sequel to Sunset of the Sabertooth
A: Midnight on the Moon by Mary Pope Osborne
Score: 100 – Position: 1
Q: graphic novel youth group fights demons
A: Youth Group by Jordan Morris
Score: 80 – Position: 2
Q: New book from author of fleishman is in trouble
A: Long Island Compromise by Taffy Brodesser-Akner
Score: 100 – Position: 1
Q: ya novel about young woman on pirate ship, mutiny
A: The True Confessions of Charlotte Doyle by Avi
Score: 30 – Position: 7
Q: graphic novel boxer rebellion
A: Boxers and Saints by Gene Luen Yang
Score: 100 – Position: 1
Q: Next book in sister holiday series
A: Blessed Water by Margot Douaihy
Score: 0 – Not in results list.
Q: last book in murakami rat series
A: Dance Dance Dance by Haruki Murakami
Score: 100 – Position: 1
Q: Book about invisible woman whos brother is a suspect in a murder
A: This Great Hemisphere by Mateo Askaripour
Score: 30 – Position: 19
Q: fantasy with linus baker
A: The House in the Cerulean Sea by TJ Klune
Score: 100 – Position: 1
Q: girl finds necklace and meets pink bunny robot
A: Sophie’s World by Jostein Gaarder
Score: 0 – Not in results list.
Q: New Arthurian epic
A: The Bright Sword by Lev Grossman
Score: 100 – Position: 1
Q: that octopus friendship novel
A: Remarkably Bright Creatures by Shelby Van Pelt
Score: 80 – Position: 2
Q: orphan girl always looks on the bright side of things
A: Pollyanna by Eleanor H. Porter
Score: 50 – Position: 5
Q: khan academy book about AI
A: Brave New Words: How AI Will Revolutionize Education and Why That’s a Good Thing by Salman Khan
Score: 100 – Position: 1
Q: Murder mystery about three foster sisters and a body found in their foster home
A: Darling Girls by Sally Hepworth
Score: 100 – Position: 1
Q: beginning chapter book mystery series about a girl with a photographic memory
A: Cam Jansen and the Mystery of the Stolen Diamonds by David A. Adler
Score: 30 – Position: 7
Q: official biography of steve jobs
A: Steve Jobs by Walter Isaacson
Score: 100 – Position: 1
Q: book about a serial killer in chicago during the chicago world’s fair
A: The Devil in the White City: Murder, Magic, and Madness at the Fair That Changed America by Erik Larson
Score: 100 – Position: 1
Q: book where girl eats manna from heaven
A: The Wonder by Emma Donoghue
Score: 0 – Not in results list.
Q: 2024 national book award novel
A: James by Percival Everett
Score: 100 – Position: 1
Q: zombie book with “hungries”
A: The Girl With All the Gifts by M. R. Carey
Score: 100 – Position: 1
Q: Romance where she gets the expiration date of the relationships she starts
A: Expiration Dates by Rebecca Serle
Score: 100 – Position: 1
Q: martha ballard mystery
A: The Frozen River by Ariel Lawhon
Score: 80 – Position: 2
Q: last book in dune caladan trilogy
A: Dune: The Heir of Caladan by Brian Herbert
Score: 100 – Position: 1
Q: aliens arrive in medieval Germany
A: Eifelheim by Michael Flynn
Score: 100 – Position: 1
Q: lesbian taxidermist in florida
A: Mostly Dead Things by Kristen Arnett
Score: 100 – Position: 1
Q: book about witches in the suffrage movement
A: The Once and Future Witches by Alix E. Harrow
Score: 100 – Position: 1
Q: hellboy rpg
A: Hellboy Sourcebook and Roleplaying Game by Mike Mignola
Score: 100 – Position: 1
Q: Pre-civil war philadelphia maid and abolotionist girl help enslaved girl escape
A: All We Were Promised: A Novel by Ashton Lattimore
Score: 0 – Not in results list.
Q: graphic novel memoir set in a funeral home
A: Fun Home: A Family Tragicomic by Alison Bechdel
Score: 100 – Position: 1
Q: second tintin book
A: Tintin in the Congo by Hergé
Score: 100 – Position: 1
Q: dear sugar book
A: Tiny Beautiful Things: Advice on Love and Life from Dear Sugar by Cheryl Strayed
Score: 100 – Position: 1
Q: time travel bureaucracy in england
A: The Ministry of Time by Kaliane Bradley
Score: 100 – Position: 1
Q: roadtrip to visit sites of political assassinations
A: Assassination Vacation by Sarah Vowell
Score: 100 – Position: 1
Q: second Molly american girl book
A: Molly Learns a Lesson by Valerie Tripp
Score: 80 – Position: 2
Q: book about woman who lives in sand pit
A: The Woman in the Dunes by Kōbō Abe
Score: 100 – Position: 1
Q: taylor swift book be rolling stone reporter
A: Heartbreak Is the National Anthem: A Celebration of Taylor Swift’s Musical Journey, Cultural Impact, and Reinvention of Pop Music for Swifties by a Swiftie by Rob Sheffield
Score: 0 – Not in results list.
Q: sam bankman fried book
A: Going Infinite: The Rise and Fall of a New Tycoon by Michael Lewis
Score: 100 – Position: 1
Q: irish novel with tightrope walk between Twin Towers
A: Let the Great World Spin by Colum McCann
Score: 100 – Position: 1
Q: children’s book with easter bunny mother
A: The Country Bunny and the Little Gold Shoes by DuBose Heyward
Score: 100 – Position: 1
Q: the fifth frontiers saga book
A: Rise of the Corinari (The Frontiers Saga, #5) by Ryk Brown
Score: 100 – Position: 1
Q: dystopian novel with big brother
A: 1984 by George Orwell
Score: 80 – Position: 2
Q: historical murder mystery set in san francisco with a gay ex-cop
A: Lavender House by Lev AC Rosen
Score: 0 – Not in results list.
Q: cerulean sea sequel
A: Somewhere Beyond the Sea by TJ Klune
Score: 100 – Position: 1
Q: Magical realism-western about Mexican man in Texas trying to save his family
A: The Bullet Swallower by Elizabeth Gonzalez James
Score: 0 – Not in results list.
Q: literary fiction about the making of the Oxford English Dictionary
A: The Dictionary of Lost Words by Pip Williams
Score: 60 – Position: 4
While doing the research for a future talk, I came across an obscure but impressively prophetic report entitled Accessibility and Integrity of Networked Information Collections that Cliff Lynch wrote for the federal Office of Technology Assessment in 1993, 32 years ago. I say "obscure" because it doesn't appear in Lynch's pre-1997 bibliography.
To give you some idea of the context in which it was written, unless you are over 70, it was more than half your life ago when in November 1989 Tim Berners-Lee's browser first accessed a page from his Web server. It was only about the same time that the first commercial, as opposed to research, Internet Service Providers started, with the ARPANET being decommissioned the next year. Two years later, in December of 1991, the Stanford Linear Accelerator Center put up the first US Web page. In 1992 Tim Berners-Lee codified and extended the HTTP protocol he had earlier implemented. It would be another two years before Netscape became the first browser to support HTTPS. It would be two years after that before the IETF approved HTTP/1.0 in RFC 1945. As you can see, Lynch was writing among the birth-pangs of the Web.
Although Lynch was insufficiently pessimistic, he got a lot of things exactly right. Below the fold I provide four out of many examples.
Page numbers refer to the PDF, not to the original. Block quotes without a link are from the report.
Page 66:
The ultimate result a few years hence — and it may not be a bad or inappropriate response, given the reality of the situation — may be a perception of the Internet and much of the information accessible through it as the "net of a million lies", following science fiction author Vernor Vinge's vision of an interstellar information network characterized by the continual release of information (which may or may not be true, and where the reader often has no means of telling whether the information is accurate) by a variety of organizations for obscure and sometimes evil reasons.
The Vernor Vinge reference is to A Fire Upon the Deep:
In the novel, the Net is depicted as working much like the Usenet network in the early 1990s, with transcripts of messages containing header and footer information as one would find in such forums.
The downsides of a social medium to which anyone can post without moderation were familiar to anyone who was online in the days of the Usenet:
Usenet is culturally and historically significant in the networked world, having given rise to, or popularized, many widely recognized concepts and terms such as "FAQ", "flame", sockpuppet, and "spam".
...
Likewise, many conflicts which later spread to the rest of the Internet, such as the ongoing difficulties over spamming, began on Usenet:
"Usenet is like a herd of performing elephants with diarrhea. Massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it."
— Gene Spafford, 1992
Earlier in the report Lynch had written (Page 23):
Access to electronic information is of questionable value if the integrity of that information is seriously compromised; indeed, access to inaccurate information, or even deliberate misinformation, may be worse than no access at all, particularly for the naive user who is not inclined to question the information that the new electronic infrastructure is offering.
This resonates as the wildfires rage in Los Angeles.
The parameter to this message gives a specification of charging schemes acceptable. The client may retry the request with a suitable ChargeTo header.
The Web in 1993 lacked paywalls. But Lynch could see them coming (Page 22):
There is a tendency to incorrectly equate access to the network with access to information; part of this is a legacy from the early focus on communications infrastructure rather than network content. Another part is the fact that traditionally the vast bulk of information on the Internet has been publicly accessible if one could simply obtain access to the Internet itself, figure out how to use it, and figure out where to locate the information you wanted. As proprietary information becomes accessible on the Internet on a large scale, this will change drastically. In my view, access to the network will become commonplace over the next decade or so, much as access to the public switched telephone network is relatively ubiquitous today. But in the new "information age" information will not necessarily be readily accessible or affordable;
The current RFC 9110 states:
The 402 (Payment Required) status code is reserved for future use.
Instead today's Web is infested with paywalls, each with their own idiosyncratic user interface, infrastructure, and risks.
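As an aside, here is a minimal sketch of what a 402 response would look like on the wire, built with the `HTTPStatus` enum from Python's standard library. The function name and the body text are invented for illustration; nothing here comes from Lynch's report or from any real paywall.

```python
from http import HTTPStatus


def payment_required_response(body: str = "Payment is required to view this resource.") -> bytes:
    """Build a raw HTTP/1.1 response carrying the reserved 402 status (illustrative only)."""
    status = HTTPStatus.PAYMENT_REQUIRED  # value 402, phrase "Payment Required"
    payload = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status.value} {status.phrase}\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


if __name__ == "__main__":
    print(payment_required_response().decode("utf-8"))
```

Because the code is only reserved, a client receiving it has no standard way to learn how to pay, which is one reason each paywall ends up inventing its own flow.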
Now, consider a library acquiring information in an electronic format. Such information is almost never, today, sold to a library (under the doctrine of first sale); rather, it is licensed to the library that acquires it, with the terms under which the acquiring library can utilize the information defined by a contract typically far more restrictive than copyright law. The licensing contract typically includes statements that define the user community permitted to utilize the electronic information as well as terms that define the specific uses that this user community may make of the licensed electronic information. These terms typically do not reflect any consideration of public policy decisions such as fair use, and in fact the licensing organization may well be liable for what its patrons do with the licensed information.
The power imbalance between publishers and their customers is of long standing, and it especially affects the academic literature. In 1989 the Association of Research Libraries published Report of the ARL Serials Prices Project:
The ARL Serials Initiative forms part of a special campaign mounted by librarians in the 1980s against the high cost of serials subscriptions. This is not the first time that libraries have suffered from high serial prices. For example, in 1927 the Association of American Universities reported that:
"Librarians are suffering because of the increasing volume of publications and rapidly rising prices. Of special concern is the much larger number of periodicals that are available and that members of the faculty consider essential to the successful conduct of their work. Many instances were found in which science departments were obligated to use all of their allotment for library purposes to purchase their periodical literature which was regarded as necessary for the work of the department"
The oligopoly rents extracted by academic publishers have been a problem for close on a century, if not longer! Lynch's analysis of the effects of the Web's amplification of this power imbalance is wide-ranging, including (Page 31):
Very few contracts with publishers today are perpetual licenses; rather, they are licenses for a fixed period of time, with terms subject to renegotiation when that time period expires. Libraries typically have no controls on price increase when the license is renewed; thus, rather than considering a traditional collection development decision about whether to renew a given subscription in light of recent price increases, they face the decision as to whether to lose all existing material that is part of the subscription as well as future material if they choose not to commit funds to cover the publisher's price increase at renewal time.
Thus destroying libraries' traditional role as stewards of information for future readers. And (Page 30):
Of equal importance, the contracts typically do not recognize activities such as interlibrary loan, and prohibit the library licensing the information from making it available outside of that library's immediate user community. This destroys the current cost-sharing structure that has been put in place among libraries through the existing interlibrary loan system, and makes each library (or, perhaps, the patrons of that library) responsible for the acquisitions cost of any material that is to be supplied to those patrons in electronic form. The implications of this shift from copyright law and the doctrine of first sale to contract law (and very restrictive contract terms) is potentially devastating to the library community and to the ability of library patrons to obtain access to electronic information — in particular, it dissolves the historical linkage by which public libraries can provide access to information that is primarily held by research libraries to individuals desiring access to this information. There is also a great irony in the move to licensing in the context of computer communications networks — while these networks promise to largely eliminate the accidents of geography as an organizing principle for inter-institutional cooperation and to usher in a new era of cooperation among geographically dispersed organizations, the shift to licensing essentially means that each library contracting with a publisher or other information provider becomes as isolated, insular organization that cannot share its resources with any other organization on the network.
we are now seeing considerable use of multi-source data fusion: the matching and aggregation of credit, consumer, employment, medical and other data about individuals. I expect that we will recapitulate the development of these secondary markets in customer behavior histories for information seeking in the 1990s; we will also see information-seeking consumer histories integrated with a wide range of other sources of data on individual behavior.
He described search-based advertising (Page 61):
The ability to accurately, cheaply and easily count the amount of use that an electronic information resource receives (file accesses, database queries, viewings of a document, etc.) coupled with the ability to frequently alter prices in a computer-based marketplace (particularly in acquire on demand systems that operate on small units of information such as journal articles or database records, but even, to a lesser extent, by renegotiating license agreements annually) may give rise to a number of radical changes. These potentials are threatening for all involved.
The ability to collect not only information on what is being sought out or used but also who is doing the seeking or using is potentially very valuable information that could readily be resold, since it can be used both for market analysis (who is buying what) and also for directed marketing (people who fit a certain interest profile, as defined by their information access decisions, would likely also be interested in new product X or special offer Y). While such usage (without the informed consent of the recipient of the advertising) may well offend strong advocates of privacy, in many cases the consumers are actually quite grateful to hear of new products that closely match their interests. And libraries and similar institutions, strapped for revenue, may have to recognize that usage data can be a valuable potential revenue source, no matter how unattractive they find collecting, repackaging and reselling this information.
Of course, it wasn't the libraries but Google, spawned from the Stanford Digital Library Project, which ended up collecting the information and monetizing it. And the power imbalance between publishers and readers meant that the reality of tracking was hidden (Page 63):
when one is accessing (anonymously or otherwise) a public-access information service, it is unclear what to expect, and in fact at present there is no way to even learn what the policy of the information service provider is.
by David. (noreply@blogger.com) at January 16, 2025 04:00 PM
This week, I'm going to tug on time. This follows the last item in last week's issue of Thursday Threads: The Clock that Made Power Grids Possible. Two years ago, I also published an issue about time, pointing to articles about eliminating the leap second, time standards on the moon, and observational humor on how we might explain our concept of time to aliens. That last one might form the thread that I tug on in the next issue because it touches on whether our digital selves will stand the test of time.
This week:
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
The 8-minute video companion to the above article is great to watch, too. This is a marvel of engineering — synchronizing the clocks of a whole city through puffs of air traveling through pipes. This system—accurate to a minute—was just 35 years before the sub-second precision required to synchronize the power grid, as described at the end of last week's issue.
I first encountered this when setting up a Zoom meeting for colleagues in Kathmandu. While most countries neatly set their clocks to full hour offsets (or, as noted in the quote above, a half-hour offset), Nepal ticks to its own clock with a 5-hour and 45-minute offset from UTC. It's as if Nepal took a look at the standard time zones and said, "Why be ordinary when you can add a twist?" Imagine trying to schedule a call back home, perplexed as you reconcile not just the time difference but—and here's the kicker—those extra 15 minutes that make Nepal unique.
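The arithmetic is easy to fumble by hand, so here is a minimal sketch in Python. It assumes a fixed UTC+05:45 offset, which works because Nepal does not observe daylight saving time; the meeting time and the names `NEPAL` and `to_nepal` are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Nepal Time as a fixed offset of 5 hours 45 minutes ahead of UTC.
NEPAL = timezone(timedelta(hours=5, minutes=45), name="NPT")


def to_nepal(moment: datetime) -> datetime:
    """Convert a timezone-aware datetime to Nepal Time."""
    return moment.astimezone(NEPAL)


# A hypothetical call scheduled for 09:00 UTC.
meeting_utc = datetime(2025, 1, 16, 9, 0, tzinfo=timezone.utc)
meeting_npt = to_nepal(meeting_utc)
print(meeting_npt.isoformat())  # 2025-01-16T14:45:00+05:45
```

For anything beyond a quick check, a time zone database lookup such as `zoneinfo.ZoneInfo("Asia/Kathmandu")` is the safer choice, since it also knows about historical offset changes.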
The Thursday Threads issue two years ago talked about the need to keep accurate time on the moon. Following an announcement from the White House early in 2024 directing NASA to create a time standard for the moon, U.S., European, and Chinese efforts are underway to make that happen.
File this away for use at parties...
So let's talk about the third cat in the house (after Alan in the last issue and Mittens in the issue before). This is Pickle, a black-and-white tuxedo cat with a drive for food that I've never witnessed in another cat. Two stories from one recent afternoon: First, when my wife got home from the grocery store, Pickle grabbed the bag of doughnuts from a canvas bag and made off with a big chunk of a long-john. Then, when she was fixing dinner, Pickle jumped on the counter and made off with a hunk of steak. My wife chased her around the dining room table, through the living room, and up the stairs to my daughter’s room. I rushed to follow, and we trapped Pickle between the headboard and the wall. My wife thinks the cat wolfed down a sizable chunk of meat before we could catch her.
That, ladies and gentlemen, is Pickle.
I run some lightweight privacy-respecting self-hosted analytics for my blog, so I know what my most popular posts were in 2024. It's hardly surprising that many of these were also published last year, but they include one from 2013 and another from 2018. Having a quick peek at my stats reminded me that the blog content that is most appreciated, shared and read is often not what you might think and some posts retain value over time. My most popular posts last year include a couple of write-ups of conferences I attended, a conference talk I gave, a highly personal reflection on the biggest single-day wave of people moving from Twitter to Mastodon (which is far and away the most-read blog post I've ever written), as well as a few technical descriptions of how to do specific things, and a post I wrote 11 years ago about 3D printers.
What I personally appreciate a great deal is blog posts outlining exactly how some technical thing works, or a step by step description of how someone did something. This also happens to be some of the most consistently popular content on my own blog - the top post last year is something of no relevance to my day job and about a topic I am not really an expert in. But it explains step by step how I did something that a lot of people want to know how to do, so it's useful to the world.
I want to read more stuff like this - helpful tips from human beings who aren't trying to sell anything and aren't just posting to get a cheap reaction on social media. You should get a blog.
The great thing about having your own blog is that there are no rules: you can write about whatever you want. I started off mostly throwing my uninformed opinions about librarianship into the void, but over the years I've written a lot of different things, just as the list above shows.
Liam's blog was originally a food blog but is now quite eclectic: brief commentary on the New South Wales planning scheme, observations about how emergency management differs between countries, notes about what he's reading, and the occasional nori roll recipe.
Ed mostly posts about technology but sometimes shares a whimsical photograph.
Julia specialises in incredibly detailed explanations of how various computer things work but sometimes she'll post about crochet patterns or how to write zines.
Jessamyn writes about whatever is on her mind which could mostly be described as "libraries and open culture" but covers a lot of ground.
Are things interesting to you? Did you learn something today? Congratulations, you have something to write about that will be interesting to someone else. I've written blog posts that I thought were well crafted and interesting, and have hardly any views. I've bashed out some half-arsed thoughts off the top of my head, and they've ended up being the most popular things I've ever published. Who knows, man? Just try not to defame anyone, and then put it out there.
Neither do I, which is why I don't publish posts regularly. The same rule applies here as for what to write about - there are no rules! Liam publishes nothing for months, and then pumps out four posts in a week. I've had wildly different posting schedules over the years. Adam Mastroianni would rather trash his draft and publish nothing than post something he's not happy with just to keep a schedule. Ashley published one post a week for 39 weeks and then took five months off.
Ten years ago I wrote about how I got over my fears about my blog posts being used for corporate profit. I think the same applies to LLMs, even though they don't attribute their sources. If writing is how you make your livelihood then different rules apply.
Great, I can't read it on Facebook or LinkedIn, because they're enclosed spaces that require a login to read them. This is also why I strongly urge against using something like Medium, which isn't really a blogging platform since it requires logging in to read posts. Mastodon and other fediverse software and platforms are better, but they're not blogs. You think differently when you're posting longer form content, using a platform that's designed for that.
I recommend something that provides
All the suggestions below offer these.
This is not actually necessary, but I strongly suggest you set up your own domain name (e.g. example.com) and set it to auto-renew so you don't accidentally lose it. Some webhosts provide domain registration as well, or you can do it separately. Somewhere like Gandi will get you started. Don't use GoDaddy.
Once you've done that, it all depends on how you prefer to do things, and what your budget is. Earlier this week I asked my Mastodon bubble for suggestions for first-time bloggers - thanks to everyone for your suggestions!
If you can't be bothered reading everything below or it's too hard to decide, get a WordPress blog with Reclaim Hosting.
USD$5 per month
I haven't used this myself, but Blot might be exactly what you're looking for. The demo on the website looks pretty impressive to me, and the price is attractive. You can use your own custom domain with Blot and it is fully managed for you. Once you've configured Blot, you publish by adding files and folders to a synced folder in Google Drive, Dropbox, or a Git repository, so you can use an application you already know to actually write your content, like MS Word or your favourite text editor.
FREE
If you're too young to remember Geocities, or old enough and are still mourning its demise, then Neocities might be for you. Neocities is designed to be the 2020s version of Geocities: you write raw HTML and the example sites look kinda out there and glitchy because that's the point.
~AUD$5-$20 per month
A lot of people recommended hosted WordPress as the best option for most people. See my note further down about WordPress.com and why I do not recommend WordPress.com as a host. Whilst at the time of writing you may hear that "the WordPress world is in turmoil right now", the reality is that this is extremely unlikely to impact most owners of hosted WordPress sites: the argument is within the WordPress developer community and however it is resolved, it's in everyone's interest for WordPress users to barely notice and it's a piece of openly-licensed software rather than a platform that can just be switched off.
Reclaim Hosting comes highly recommended by many people over time. They're focussed on higher education in the USA but anyone can sign up for a personal plan at very attractive pricing. This is probably the best option for most people.
If you want something based in Australia with local support, a couple of different people recommended VentraIP. This will be more expensive than Reclaim even after accounting for currency exchange rates.
There are many other options - look for "Hosted WordPress". Generally what you get is "shared hosting" with "CPanel", which means your blog will be in a separated section of a web server also hosting several other websites, and you can use a web interface to configure things like the domain you use for your blog. Your chosen host will usually have good documentation on how to get set up.
USD$9 per month
Ghost was originally a Kickstarter project by a former WordPress core developer, but has developed quickly from there. Ghost can be used for both websites (e.g. 404 Media) and newsletters (e.g. Mita Williams' University of Winds). Ghost takes the clean and simple markdown-based approach of static site generators but removes all the nerdy futzing so it's more like the WordPress experience. Indeed whilst writing in markdown was originally the only way to use Ghost, it now offers a rich WYSIWYG writing interface as well, so you can compare Ghost and hosted WordPress to see which one you prefer. I published this blog using self-hosted Ghost for a while.
FREE
Publii is an open source static site generator (see below for more on this), but you can connect it to a free GitHub or GitLab Pages account to publish. Interestingly, publishing and configuring Publii works as a desktop application rather than a web interface, which is a little different to most of the options listed here and makes it a lot simpler for normal people than a commandline based system like I describe below.
Blogger is a free service from Google. It's quite bare-bones and really geared towards posting content to attract people to view ads where you share the revenue with Google, so the primary use of Blogger is by spam-blogs. As a Google product you also never know when it will join the Google graveyard. But if you're looking for something basic and free, Blogger was nominated by a couple of people in my unscientific survey, and you will be joining successful and interesting bloggers like Aaron Tay.
FREE to ~USD$10
If you're keen to have more control, you can look into using a static site generator (SSG). An SSG is essentially a commandline script that takes a bunch of input files and outputs a website - HTML files in directories, with all the relevant images, CSS and JavaScript and everything pointing to the right place. Different SSGs use different templating languages, but pretty much all of them use markdown in the page content file and convert it into HTML using an appropriate template.
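To make the "input files in, website out" idea concrete, here is a toy sketch of what an SSG does. It is not how Zola, Eleventy or Jekyll actually work: the function names, the template and the tiny markdown-like subset (only `# ` headings and plain paragraphs) are all made up for illustration, and real SSGs use a full markdown parser and a proper templating language.

```python
from html import escape
from pathlib import Path

# Hypothetical one-page template; real SSGs use a templating language.
TEMPLATE = """<!DOCTYPE html>
<html><head><meta charset="utf-8"><title>{title}</title></head>
<body>
{body}
</body></html>
"""


def render_body(text):
    """Convert a tiny markdown-like subset to HTML.

    Only '# ' headings and blank-line-separated paragraphs are handled.
    """
    blocks = []
    for chunk in text.strip().split("\n\n"):
        chunk = chunk.strip()
        if not chunk:
            continue
        if chunk.startswith("# "):
            blocks.append(f"<h1>{escape(chunk[2:])}</h1>")
        else:
            # Join wrapped lines so each paragraph becomes one <p> element.
            blocks.append(f"<p>{escape(' '.join(chunk.splitlines()))}</p>")
    return "\n".join(blocks)


def build_site(src_dir, out_dir):
    """Render every .md file in src_dir to an .html file in out_dir."""
    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for page in sorted(src.glob("*.md")):
        html = TEMPLATE.format(
            title=escape(page.stem),
            body=render_body(page.read_text(encoding="utf-8")),
        )
        target = out / (page.stem + ".html")
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written
```

Calling `build_site("content", "public")` would turn `content/hello.md` into `public/hello.html`; the output directory is then just a folder of files you can upload to any web host.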
My blog is made using Zola, but I've previously used Eleventy. To publish with an SSG you either need to use GitLab Pages (which works with most SSGs) or GitHub Pages (which only works with the Jekyll SSG); or have control over some space on some kind of webserver - either shared hosting (something with CPanel), or a standalone virtual private server (VPS). There's a bit of technical work involved to publish this way, so it's not surprising that a great many blogs published with SSGs start off with a couple of posts about how they set up their blog, and sometimes end there. If you want to procrastinate with your SSG setup instead of writing blog posts, this could be a great choice.
The WordPress world is currently experiencing some difficulties, after one of the original creators of WordPress, and owner of WordPress.com, Matt Mullenweg, seems to have taken leave of his senses. His behaviour has been so erratic over the last month that I cannot recommend using his company (WordPress.com/Automattic) to host your blog. I probably wouldn't have recommended this anyway, as I think Automattic is pretty aggressive at upselling to unsuspecting new users. Since the WordPress software is openly licensed, anyone else can use it and provide hosting for you, as I outlined above. The software itself is very robust and several of the other software options I suggest provide exports using the WordPress XML export standard.
Wix is the Yahoo Mail of blogging platforms, with a laggy, busy interface that is constantly upselling to you. It also doesn't provide an export function - if you start a Wix site you're essentially stuck paying Wix until they go bankrupt and your blog is deleted forever.
Squarespace was recommended to me as a good option that "just works" when I asked for suggestions on Mastodon. At AUD$16 per month I don't consider Squarespace a good deal compared to the nearest alternative of hosted WordPress - it's not open source so the only host you can use for a Squarespace blog is Squarespace (although you can export your blog in the WordPress XML format to take it somewhere else). You're also at the mercy of Squarespace's corporate strategy.
Once you've set up your blog, you can add it to the list at ausglamr.newcardigan.org. Then every time you publish a blog post, it will be shared with the GLAMR world. You can add certain tags to your post if you don't want a particular post to be added to the Aus GLAMR feed.
Now get blogging!
Archives have never collected everything, but everything can become archival.
This was a somewhat random grandiose utterance during a conversation today about social media archiving and the Records Continuum while thinking about Suzanne Briet.
Lucidworks AI empowers businesses to seamlessly integrate, manage, and optimize generative AI, driving innovation and efficiency while ensuring accuracy and responsible use.
The post Meet Lucidworks AI, the AI orchestration engine for search appeared first on Lucidworks.
It is time for another roundup of topics in storage that have caught my eye recently. Below the fold I discuss the possible ending of the HAMR saga and various developments in archival storage technology.
Seagate's 2018 HAMR roadmap |
Seagate has set a course to deliver a 48TB disk drive in 2023 using its HAMR (heat-assisted magnetic recording) technology, doubling areal density every 30 months, meaning 100TB could be possible by 2025/26. ... Seagate will introduce its first HAMR drives in 2020. ... a 20TB+ drive will be rolled out in 2020.
So in a decade the technology had gone from next year to the year after next. The year after next Jim Slater wrote HAMR don’t hurt ’em—laser-assisted hard drives are coming in 2020:
Seagate has been trialing 16TB HAMR drives with select customers for more than a year and claims that the trials have proved that its HAMR drives are "plug and play replacements" for traditional CMR drives, requiring no special care and having no particular poor use cases compared to the drives we're all used to.
But no, it would be another four years before we saw the first signs of HAMR drives in the market. In December 2024 Matthew Connatser reported that Seagate launches 32TB Exos M hard drive based on HAMR technology – Mozaic 3+ drives are the world’s first generally available HAMR HDDs:
Seagate’s biggest-ever hard drive is finally here, coming with 32TB of capacity courtesy of the company’s new HAMR technology (via Expreview).
Note that the drives that are "(nearly) here" are still not available from Amazon, although they are featured on Seagate's web site. Kevin Purdy writes:
It has almost been a year since Seagate said it had finally made a hard drive based on heat-assisted magnetic recording (HAMR) technology using its new Mozaic 3+ platform.
...
Exos drives based on Mozaic 3+ were initially released to select customers in small quantities, but now the general release is (nearly) here, thanks to mass production.
Drives based on Seagate's Mozaic 3+ platform, in standard drive sizes, will soon arrive with wider availability than its initial test batches. The driver maker put in a financial filing earlier this month (PDF) that it had completed qualification testing with several large-volume customers, including "a leading cloud service provider," akin to Amazon Web Services, Google Cloud, or the like. Volume shipments are likely soon to follow.
More indications that volume shipments could happen "next year" come from Chris Mellor's WD’s HAMR switch could be closer than we think:
There is no price yet, nor promise of delivery, but you can do some wishful thinking on the product page for the Exos M, where 30 and 32TB capacities are offered. That's 3TB per platter, and up to three times the efficiency per terabyte compared to "typical drives," according to Seagate.
Intevac has said there is strong interest in its HAMR disk drive platter and head production machinery from a second customer, which could indicate that Western Digital is now involved in HAMR disk developments following Seagate’s move into volume production.
Intevac supplies its 200 Lean thin-film processing machines to hard disk drive media manufacturers, such as Seagate, Showa Denko and Western Digital. It claims more than 65 percent of the world’s HDD production relies on its machinery. The Lean 200 is used to manufacture recording media, disk drive platters, for current perpendicular magnetic recording (PMR) disks.
Intevac’s main customer for HAMR-capable 200 Lean machines is Seagate, which first embarked on its HAMR development in the early 2000s. It is only this year that a prominent cloud service provider has certified Seagate’s Mozaic 3 HAMR drives for general use, more than 20 years after development first started. The lengthy development period has been ascribed to solving difficulties in producing drives with high reliability from high yield manufacturing processes, and Intevac will have been closely involved in ensuring that its 200 Lean machines played their part in this.
Surprisingly, with no special storage precautions, generic low-cost media, and consumer drives, I'm getting good data from CD-Rs more than 20 years old, and from DVD-Rs nearly 18 years old.
The market for DVD-R media and drives is gradually dying because they have been supplanted in the non-archival space by streaming, an illustration that consumers really don't care about archiving their data!
Engineers, your challenge is to increase the speed of synthesis by a factor of a quarter of a trillion, while reducing the cost by a factor of fifty trillion, in less than 10 years while spending no more than $24M/yr.
The only viable market for DNA storage is the data-center, and the two critical parameters are still the write bandwidth and the write cost. As far as I'm aware despite the considerable progress in the last 6 years both parameters are still many orders of magnitude short of what a system would have needed back then to enter the market. Worse, the last six years of data center technology development have increased the need for write bandwidth and reduced the target cost. DNA storage is in a Red Queen's Race and it is a long way behind.
The markedly expanding global data-sphere has posed an imminent challenge on large-scale data storage and an urgent need for better storage materials. Inspired by the way genetic information is preserved in nature, DNA has been recently considered a promising biomaterial for digital data storage owing to its extraordinary storage density and durability.
The paper attracted comment from, among others, The Register, Ars Technica and Nature. In each case the commentary included some skepticism. Here are Carina Imburgia and Jeff Nivala from the University of Washington team in Nature:
However, there are still challenges to overcome. For example, epigenetic marks such as methyl groups are not copied by the standard PCR techniques used to replicate DNA, necessitating a more complex strategy to preserve epi-bit information when copying DNA data. The long-term behaviour of the methyl marks (such as their stability) in various conditions is also an open question that requires further study.
You have to read a long way into the paper to find that:
Another challenge is that many applications require random access memory (RAM), which enables subsets of data to be retrieved and read from a database. However, in the epi-bit system, the entire database would need to be sequenced to access any subset of the files, which would be inefficient using nanopore sequencing. Moreover, the overall cost of the new system exceeds that of conventional DNA data storage and of digital storage systems, limiting immediate practical applications;
we stored 269,337 bits including the image of a tiger rubbing from the Han dynasty in ancient China and the coloured picture of a panda ... An automatic liquid handling platform was used to typeset large-scale data at a speed of approximately 40 bits s⁻¹
This is interesting research but the skepticism in the commentaries doesn't exactly convey the difficulty and the time needed to scale from writing less than 40KB in a bit under 2 hours, to the petabyte/month rates (about 2.8TB every 2 hours) Facebook was writing a decade ago. This would be a speed-up of nearly 8 orders of magnitude to compete with decade-old technology.
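The scale of that gap can be checked with a few lines of arithmetic. The rough sketch below uses only the figures quoted above (269,337 bits, about 40 bits per second, a petabyte per month); the decimal units, the 30-day month and the function names are my assumptions.

```python
import math

# Figures quoted in the text above.
BITS_WRITTEN = 269_337       # bits stored in the epi-bit demonstration
WRITE_RATE_BPS = 40          # approximate write speed, bits per second

# Assumptions, not from the source: decimal units and a 30-day month.
PETABYTE_BYTES = 10**15
SECONDS_PER_MONTH = 30 * 24 * 3600


def demo_write_hours(bits=BITS_WRITTEN, rate=WRITE_RATE_BPS):
    """Hours needed to write the demonstration data at the quoted rate."""
    return bits / rate / 3600


def demo_kilobytes(bits=BITS_WRITTEN):
    """Size of the demonstration data in decimal kilobytes."""
    return bits / 8 / 1000


def target_tb_per_two_hours():
    """Terabytes written every two hours at one petabyte per month."""
    bytes_per_second = PETABYTE_BYTES / SECONDS_PER_MONTH
    return bytes_per_second * 2 * 3600 / 10**12


def speedup_orders_of_magnitude(rate=WRITE_RATE_BPS):
    """log10 of the ratio between the target and demonstrated write rates."""
    target_bps = PETABYTE_BYTES * 8 / SECONDS_PER_MONTH
    return math.log10(target_bps / rate)
```

Under these assumptions the demonstration took about 1.9 hours to write roughly 34KB, the petabyte-per-month target is about 2.8TB every two hours, and the ratio of the two write rates is a little under 10^8.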
The research, published in Nature Photonics, highlights that the breakthrough extends beyond density. It is said to offer significant improvements in write times – as little as 200 femtoseconds – and lives up to the promise that "a diamond is forever" by offering millions of years of allegedly maintenance-free storage. Diamonds are highly stable by nature and the authors have claimed their medium could protect data for 100 years even if kept at 200°C.
These researchers, like so many others in the field, fail to understand that the key to success in archival storage is reducing total system cost. Long-lived but expensive media like diamonds are thus counter-productive.
High-speed readout is demonstrated with a fidelity of over 99 percent, according to the boffins.
Scientists have been eyeing diamonds as storage devices for a while. Researchers at City College of New York in 2016 claimed to be the first group to demonstrate the viability of using diamond as a platform for superdense memory storage.
by David. (noreply@blogger.com) at January 13, 2025 08:47 PM
LibraryThing is pleased to sit down this month with poet and book publicist Kim Dower, who has worked with authors from Kristin Hannah to Paulo Coelho through her freelance literary publicity company, Kim-from-L.A. The City Poet Laureate of West Hollywood from October 2016 – October 2018, she is the author of five previous collections of poetry, including the bestselling I Wore This Dress Today for You, Mom (2022), which was praised by The Washington Post as a “fantastic collection.” Her first collection, Air Kissing on Mars (2010), was praised by the Los Angeles Times as “sensual and evocative… seamlessly combining humor and heartache.” Her work has appeared in literary publications such as Plume, Ploughshares, Rattle, The James Dickey Review, and Garrison Keillor’s “The Writer’s Almanac.” Her newest book, What She Wants: Poems on Obsession, Desire, Despair, Euphoria, will be published later this month by Red Hen Press. Dower sat down with Abigail to answer some questions about her work, and this new book.
What She Wants is your sixth poetry collection, and addresses the theme of obsessive love. What was the inspiration behind the book? Did it begin with a specific poem, a personal experience you wanted to explore, or something else?
I was reading an article (can’t remember where!) and came upon the word “Limerence.” I thought it was a beautiful sounding word, and its meaning, the state of being obsessively infatuated with someone, usually accompanied by delusions of or a desire for an intense romantic relationship with that person, fascinated me! I became obsessed with a word that meant to be obsessed! I realized I had many finished poems and many in the works that fit into this category, so I built a collection based on this idea and the four stages of limerence: infatuation, crystallization, deterioration and ecstatic release.
What makes poetry unique, as a form of literary expression? Is it just the structure that makes it different from prose, or does it communicate in different ways?
Because poetry is the most concise form of language, good poems will stir our emotions with a clarity and intensity that immediately takes hold in the reader. There’s an emotional honesty in poems that connects poet to reader to create a shared experience. It has been said that prose is like walking and poetry is like dancing. A single, short poem has the power to simultaneously comfort and terrify. The poet W.H. Auden says, “poetry is the clear expression of mixed feelings,” and this is true for the poet as she writes and the reader as well.
Can you tell us a little bit about your writing process? How does a poet begin a poem?
I don’t know how all poets begin a poem, but I begin one after being stirred or moved by something, something personal or something I’ve read or overheard. Or something I think is funny. I often read a news headline or hear something on the radio as I’m driving that immediately says THIS IS A POEM! I was once driving, listening to the local news, and the headline, talking about a new public school decision was, “They’re Taking Chocolate Milk Off the Menu!” I pulled over and wrote a poem with that title. Later, after it was published, Garrison Keillor read it on “The Writer’s Almanac.” Poems are everywhere and I use everything I see and hear as a prompt – whether it’s something whimsical that strikes me, or something more profound like hearing a dead parent speak to me.
How has working with so many different authors, through your activities as a publicist, affected your writing?
The only way working hard at a “day” job has affected my writing is I’m very focused when I sit down to write. I’ve learned how to separate the two kinds of work and my brain and mind like knowing and appreciate the difference!
You were Poet Laureate of the city of West Hollywood for two years. What sort of things did you do as a poet laureate?
It was so much fun creating different activities, readings and events and introducing people to poetry who otherwise never thought about it. My favorite project was creating a collaborative poem with people in the city. The City of West Hollywood is committed to the arts and supported all of my ideas. We designed a large pad with three prompts and I spent a few months asking strangers at local bookstores, cafes, parks, to participate in reading a prompt and writing some lines. People really enjoyed it and I created a powerful poem consisting of all their lines called, “I Sing the Body West Hollywood.” We made posters. We celebrated!
Who are some of your favorite poets, and how has their work influenced your own?
I have so many favorites and so many whose work has influenced my own. More than influence – whose work has given me permission to build my own voice. I love Frank O’Hara – New York School of Poets – who’s influenced my “conversational” often breezy style while still packing a punch! William Carlos Williams, whose poetry has taught me to strive to make each poem a “fine machine.” Erica Jong, Sharon Olds and Kim Addonizio, for their passion, beauty, perceptions; Thomas Lux, Ron Padgett, Stephen Dunn, for humor mixed with deep emotion and insight. W.H. Auden for his style. This list could go on and on.
Tell us about your library. What’s on your own shelves?
I have hundreds and hundreds of books! I love all kinds of fiction, biographies, memoir, but upstairs, in my “Poetry Palace” I have only poetry – books I’ve kept and carried for 50 years – from college through today. I have a marvelous collection from Shakespeare to contemporary poets. Occasionally, just to calm myself, I will sit on the floor and take a random book off the shelf, read one or two poems, and place it back. This morning, for example, it was Diane di Prima’s book, The Poetry Deal. I read from it aloud. Now I can go on with my day.
What have you been reading lately, and what would you recommend to other readers?
I’m re-reading Vivian Gornick’s amazing, gorgeous memoir, Fierce Attachments, about her relationship with her mother. It’s a classic and each time I read it I discover something else – not only about her – but about myself.
I’m also re-reading Savage Beauty: The Life of Edna St Vincent Millay, a great poet and a fascinating star of poetry.
My poet friend, Nina Clements – who was also a Librarian – sent me a book called Monsters by Claire Dederer, which I’m enjoying, about the link between genius and monstrosity. How do we balance our love of some artists knowing the awful things they’ve done? This is a subject that constantly fascinates me.
And I’m slowly reading and loving the poems in Kim Addonizio’s new collection, Exit Opera.
Continuing the review of 2024, the following summarizes the activities of the NDSA Interest and Working Groups. Please have a look at NDSA’s accomplishments – and feel free to reach out to NDSA with any questions on how you can get involved!
For the year, the Content Interest Group met quarterly on the first Thursday of the month at 12:00pm EST. We identified topics of interest through an ongoing but now defunct jamboard! We held 4 meetings utilizing various formats to facilitate the exchange of information. In February we held a Content Exchange about how your organization manages the access levels of digital content in your reading rooms, physical and virtual. Due to the success of the first Content Exchange, we held another one in May on how your organizations are increasing representation or including under-represented groups in your collections. In August, we switched it up with presentations by metadata experts and discussion on understanding metadata standards and ways to incorporate them in our work. Julie Shi, Digital Preservation Librarian, Scholars Portal, University of Toronto Libraries, discussed METS. Leslie Johnston, Director of Digital Preservation, U.S. National Archives and Records Administration, discussed PREMIS. We wound the year down with our final meeting in November, a discussion on using content to show the impact of preservation or the risk of loss.
For the Infrastructure Interest Group, 2024 began with a discussion led by the founders of the AEOLIAN Network, a project whose focus was “to investigate the role that AI can play to make born-digital and digitised cultural records more accessible to users.” Its outcomes included multiple workshops, case studies and journal publications, all of which focused on the larger community’s use of AI in this space. During its next two quarterly meetings, group members presented on their unique requirements and solutions surrounding content staging areas and repository ingest workflows. We listened to in-depth descriptions of workflows from the University of Alabama, Birmingham, the University Libraries at Ohio State University and finally the UW Digital Collections Center, University of Wisconsin, Madison. Our final meeting of the year introduced the Internet Archive’s Vanishing Culture: A Report on Our Fragile Cultural Record as a reading selection, from which several thought-provoking essays were brought forward for discussion by group members.
The Standards & Practices Interest Group met quarterly on the first Monday of the month at 1:00 p.m. Eastern. Our topics for this year included: Digital preservation system migration; The language of the cloud; Selection for preservation; and Persistent identifiers and preservation. Michael Dulock presented his experience with migrating preservation systems at UC-Boulder for our January meeting. Our subsequent meetings were member-driven discussions. By far the most engaging and well-attended discussion was our exploration of “the language of the cloud” at our April 1, 2024 meeting. We shared experiences with outsourcing infrastructure to major cloud-based vendors (AWS, Azure, etc), and how that has impacted our preservation practices. Out of this discussion, we formed a sub-group to develop a survey on cloud-based infrastructure practices across the membership, which included members of the Infrastructure Interest Group. With the release of the NDSA Storage Survey, we will resume work on the Cloud Services sub-group in 2025, with a follow-up discussion scheduled for our first S&P IG meeting on January 13, 2025.
The Communications and Publications Working Group (CAPs) worked with the Coordinating Committee and chairs of Interest and Working Groups to update internal documentation and website content and publish blog posts and reports. CAPs works with survey Working Groups to edit and publish the reports, sometimes working on statistical analysis quality assurance. This year CAPs developed additional guidelines around accessibility for report preparation.
The Climate Watch Working Group had a productive year establishing our workflows, clarifying our publication criteria and objectives, and setting up our publication platform. We hope to release our inaugural publication early in 2025, so keep an eye out for announcements!
Beginning in Spring 2024, the Events Strategy Working Group (ESWG) focused on a framework for operationalizing recommendations from a previous planning group. They held monthly meetings, with breakout discussions for the working groups developing plans for a National Conference and Designated Communities. Despite some uncertainties about NDSA’s organizational affiliation, ESWG plans to deliver three key items by April 2025: (1) a charge for a standing Events Steering Committee to manage NDSA’s overall events strategy and serve as a liaison to annual conference committees; (2) an annual meeting toolkit with recommendations for both in-person and online events (with in-person events to resume in 2027 and coincide with NDSA Excellence Awards); and (3) an action plan to encourage local and regional communities of practice with affiliated organizations and institutions. This action plan will define a process for developing an experts list and a speaker’s bureau to support digital preservation activities and workshops as well as a mechanism for endorsing digital preservation panels at affiliated events.
In 2024, the Excellence Awards Working Group utilized the year without an awards cycle to promote EAWG through blogs and video clips. Blogs were published via SAA’s bloggERS and the NDSA blog. Video clips were uploaded to the NDSA YouTube channel, and an Excellence Awards playlist was created to group them. Further blogs and clips are scheduled into 2025.
In addition, EAWG co-chairs drafted an Overview and Guidelines for the EAWG, which has been reviewed by the Communications and Publications Working Group. The EAWG expects to finalize this document in 2025. Finally, Jessica Venlet accepted the position of EAWG Co-chair for 2025-2027.
This year a new Levels Revision Working Group was formed with a remit to carry out a focused review of the Levels, looking specifically at their environmental impact. Look out for further news on this work in 2025.
This year we have run four Open Sessions: in January we held a general Q&A on the Levels, in April we focused on the Curation Guide, in July the topic was the Assessment Tool and October’s session provided a general introduction to the Levels aimed at those who hadn’t used them before. The Levels Steering Group also gave a presentation on Levels at the Virtual DigiPres conference.
We have seen a few changes on the Levels Steering Group, welcoming new members Rebecca Fraimow, Elizabeth La Beaud and Keith Pendergrass. Karen Cariani left the Steering Group to focus on other priorities and we thank her for all her hard work.
The Membership Working Group worked on a new process for onboarding new members to NDSA, and produced a comprehensive report with six detailed proposals designed to increase engagement of members. One of these proposals focused on having a standing Membership Working Group, who would provide onboarding and ongoing membership support. This group will launch in January 2025.
The working group published the 2023 storage infrastructure report in October, along with anonymized survey results, the survey codebook, and a crosswalk between the 2019 and 2023 survey questions. Working group members presented the survey results at iPRES, DLF, and SAA’s Research Forum.
The post NDSA Interest and Working Groups 2024 Year in Review appeared first on DLF.
I'm about halfway through Saul Griffith's 2021 Electrify: An Optimist's Playbook for Our Clean Energy Future, and I find the author makes a compelling point about bringing nearly everything—energy creation, transmission, and use—to a common factor of "electricity" and then optimizing that system. There are many interesting problems to solve, but they seem solvable.
In last week's Thursday Threads, I touched on how data centers impact the electrical grid. This week's issue looks further into how electricity is generated and distributed. The first article reflects back on the data center topic—it could have just as easily gone in last week's issue. Then there are a few other articles on generation, storage, and the shift away from carbon-based fuels, plus a look at history.
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
The New York Times publishes this in-depth piece about the boom time for commercial electricians (or anyone who wants to train to become one). Data centers require substantial electrical power to support the high computing needs of artificial intelligence and the storage to save your New Year's Eve photos (as well as the power to run the cooling systems for those computers). Although AI has accelerated the construction of data centers, significant building and expansion projects were already underway. This article is a view of the intersection of traditional construction/labor, technology, land use, and economic growth.
Electricity is unique in that the providers must exactly match the demand at every moment. Excess generation capacity must be removed from the grid...it is just as bad as too little electricity. (Storage of excess electricity is a topic all its own; see below.) In Australia, the rapid growth of solar power generation is making it difficult for the grid operator to achieve that balance. Rooftop solar is great, but having that energy dumped uncontrolled back onto the grid causes instability.
(That isn't the only problem on the grid...there are devices that, as Grady of Practical Engineering says, "force the grid to produce power and move it through the system, even though they aren’t even consuming it.")
The problem with variable sources like solar and wind is the need for a baseline supply of always-there electricity. Coal, natural gas, and nuclear are good at meeting that baseline power need. Tidal systems are a clean, constant source of energy as well.
In addition to managing the supply, there also need to be advancements in managing the demand side. Businesses already do this...their flexibility to reduce their electricity usage during high-demand events results in cheaper electricity rates because the utility doesn't need to build as much capacity just-in-case. This kind of variable pricing is also available to some homeowners. However, technology on the grid can help support this as well. This article talks about a scooter battery charging company that automatically takes equipment offline when generation capacity unexpectedly drops. Imagine this same sort of grid intelligence available for e-vehicle charging stations as well.
Solar panels only produce power when the sun is out, and wind turbines only produce power when the wind blows. We will need a way to store energy during times of overproduction and send it out to the grid when demand requires it. Many technologies are being explored to use excess energy to pump water uphill or spin a heavy flywheel. The technique in this article raises weights in a deep mine shaft to store energy.
Another potential storage solution is compressed air. All of these systems have trade-offs of expense versus capacity versus location requirements and other factors. Some of these experiments will succeed, and some won't be commercially viable.
With new generation and storage technologies, where does that leave the traditional burning-carbon-based tools? Fortunately, not long for this world.
It would seem that the momentum away from burning carbon fuels is well established. I hope it is established enough to deal with the instability that could be caused by the incoming U.S. federal administration.
Before there was a grid, there were many isolated islands of power generation. The "alternating" part of "alternating current" meant that these islands couldn't be connected until the cycles of alternation could be synchronized. We take 60-cycles-per-second for granted now, but it wasn't always this way.
This has become Alan's routine in the morning. It is far too cold—and now far too snowy—to work outside on the patio. So Alan sleeps through the long winter days on my keyboard numeric pad until spring.
An accessibility violation found in our Staff Directory during baseline testing for the library website.
In Brief
This paper employs autoethnography to expose the conference experiences of disabled scholars within the academic and library fields, highlighting the systemic barriers found in these professional settings. In integrating personal narratives with theoretical insights, this study highlights how rigid conference spaces and norms do not accommodate disabled bodyminds, which hinders professional development, and highlights the need for systemic changes. The barriers found include the mental load of navigating inaccessible spaces, the extra financial costs required for participation, and interference between the clock time of conferences and the crip time of disabled bodyminds. While conferences provide challenges, we also find them to be a place for crip connections, as seen in the authors’ friendship that provides both emotional and professional support. This paper concludes with theoretical implications for making conferences more accessible and the disentanglement of libraries and academia from their ideas of productivity and ideal workers.
By Rhys Dreeszen Bowman and Leah T. Dudak
Rhys Dreeszen Bowman (they/them) is a white, queer, nonbinary Ph.D. candidate at the University of South Carolina. They are a former high school librarian in rural New England. They have a middle-class family background and are also privileged in their whiteness, access to education, and status as a citizen. Rhys is physically disabled and chronically and mentally ill. Rhys is sometimes invisibly disabled and sometimes uses a mobility aid. They often can conceal their disability and pass as nondisabled, which reduces the ableism they experience. Passing as nondisabled also affects their ability to meet their accessibility needs and navigate the world.
Leah T. Dudak (she/her) is a white, cis, female, fat, librarian, and Ph.D. student at Syracuse University. She is disabled with diagnoses of fibromyalgia, dysgraphia, anxiety, and depression. She sees her body as disabled, but her disabilities are often invisible, so people cannot outwardly see them unless she uses a mobility aid or discloses her disability. Meanwhile, her body also carries immense privilege of sometimes passing as nondisabled, as well as the privileges of race, sexuality/gender, education, and socioeconomic status.
We use the term bodymind because we consider the physical body and mind inseparable and to act in concert (Price, 2011). We resist the Western assumption that the body and mind are distinct and the privileging of the mind over the body (Clare, 2017). We also use the term crip to move our conversation outside the insular walls of academia and link our work to disability justice activists and the tangible lives of disabled people, moving beyond a disability rights movement that is primarily concerned with helping white men integrate into mainstream society as productive citizens (Hamraie, 2017). We align ourselves instead with the disability justice revolution, built by queer and trans Black, Indigenous, people of color (BIPOC) activists, that fights for the liberation of all sick/unwell/mad/neurodivergent/crip bodyminds.
We add to the tradition of disability scholars attuning to the role of disability and resistance in academia. In Activist Affordances, Dokumaci (2023) uses visual ethnography to chronicle the lives of invisibly disabled people related to arthritis and other inflammatory diseases. Dokumaci attunes to what she calls “unnoticed choreographies,” or the everyday actions disabled people take to move through the world (p. 2). The author calls activist affordances the “performative microacts/arts through which disabled people enact and bring into being the worlds that are not already available to them, the worlds they need and which to dwell in” (pp. 2-3). She posits that disabled futures already exist and activist affordances are “outposts” of these worlds. We draw on Dokumaci’s work to frame our disabled resistance within academia and the possibilities we find for disabled futurities.
In The Undercommons: Fugitive Planning & Black Study, Harney and Moten (2013) examine how institutions such as the university impede our ability to empathize and our capacity to love. They call the undercommons a “maroon community” of teachers and students that “refuse to ask for recognition and instead want to take apart, dismantle, tear down the structure that, right now, limits our ability to find each other, to see beyond it and to access the places we know lie beyond its walls” (p. 6). They argue that the university is designed to uphold capitalism, and the university creates a free labor source to benefit the State. We borrow from Harney and Moten’s understanding of the university as an agent of capitalism, extending their argument to the way the neoliberal values of academia limit and harm disabled academics.
Additionally, this work is adding to already existing literature around the inaccessibility of conferences such as Manwiller’s (2019) article highlighting how conferences can often be the hardest part of being an academic and harder than any other professional experience. In her follow-up critique of the 2021 ACRL conference, she writes: “At the time, I thought it was my responsibility to adapt to the conference setting if I wanted to be professionally active” (2021). Finally, she questions why library organizations act and conduct business as if disabled workers do not exist. Below, we expand upon Manwiller’s articles and discuss this tension between participating in a conference and creating our own space, but also demanding to be accommodated in the rest of this work.
Price (2011) challenges the academic assumption that disability affects the body while the mind remains untouched by illness. She pushes understandings of disability and education practices to include mental disability. Price challenges readers to reconsider ableist academic values such as productivity and independence and how they damage disabled people. Anderson (2024) uses autoethnography to interrogate how disability impacts the author’s experience as a librarian. Anderson notes how writing in the field of Library and Information Science (LIS) focuses on how libraries can serve patrons with disabilities, ignoring the possibility that librarians could themselves be disabled.
Autoethnography empowers us to share our experience and, in doing so, speak back to the silence that frames illness as private and unspeakable, which allows us to open a conversation about what it means to participate in librarianship and academia in bodyminds that are devalued and excluded. We argue that autoethnography is a crip methodology. When one’s own bodymind becomes the site of research, there is no need to travel or schedule to conduct interviews, which can tax the body and ignores our need for sudden rest. Conducting research on ourselves allows us to research as it suits our bodies. We work when we are able and rest when we need to. Crip autoethnography also values disabled stories as knowledge that is worthy of study (Richards, 2008; Kasnitz, 2020). We position ourselves as both subjects and objects, and through storytelling construct knowledge (Ellis, 2004). In this, we are not creating universal truths but sharing just as valid individual ones, which can still create disruption and push change.
We use autoethnography to connect the personal and political and “show how stories become the change we want to see in the world” (Holman Jones & Harris, 2019). Our research addresses the injustice within librarianship, academia, and within our own research (Madison, 2012). By analyzing our stories, we advance knowledge and offer possibilities for new practices. Using Kafer (2013) as a model, we tie crip autoethnography to queer autoethnography to focus on subjugated knowledges. Queering autoethnography speaks truth to power and in “that speaking enacts new worlds. Not just records them” (Holman Jones & Harris, 2019, p. 64).
Additionally, oftentimes, disability and chronic illness are written about from a medicalized point of view to inform caregivers or medical professionals. However, in this, the voice of the patient/client/disabled individual is often ignored (Piepzna-Samarasinha, 2018). As such, disabled bodies become something that is worked on, rather than an active participant in care. Autoethnography allows us to say in our own words our experiences, needs, wants, and desires without our stories then being filtered through a medicalized lens. This filtering erases our voice, but through autoethnography, we fully claim it, appreciate ourselves as experts, and analyze it in our own terms (Kasnitz, 2020; Richards, 2008). While many critics of autoethnography may see this closeness as a flaw, we see it as a reclamation of voice and story.
We also acknowledge that there is no one disabled experience, which is why we are writing together in some sections (such as this one) and also separately, even though we are both academics who share a discipline. We are both insiders and outsiders to each other (Richards, 2008). In using autoethnography, we give voice to our similarities and differences, which allows us to open a world of possibilities.
We frame our experience of conference attendance through the lens of crip time, popularized by Alison Kafer (2013), who in her book Feminist, Queer, Crip uses crip time to discuss the way time operates differently for queer and disabled people. Disabled people move at a slower (or faster) pace than normative society and might need more time to complete tasks or may be late to meetings due to navigating an inaccessible physical world. Crip time is in direct conflict with normative[1] or clock time. We lose time to doctor visits, surgeries, and days spent in bed. Simple acts often result in the need for time to rest and recover. The adage that everyone has the same 24 hours crumbles upon an examination of disabled relationships with time. Kuppers calls crip time a “temporal shifting” (2014, paragraph 2), a way we slip out of place with clock time or the time at which normative society functions. Clock time is the time zone of neoliberal capitalism, while crip time belongs to the land of the ill. Samuels (2017) writes that
crip time is time travel. Disability and illness have the power to extract us from linear, progressive time with its normative life stages and cast us into a wormhole of backward and forward acceleration, jerky stops and starts, tedious intervals and abrupt endings (paragraph 5).
Crip time pushes disabled people out of time. The challenge and impossibility of operating on clock time is invisibilized and often overlooked when considering accessibility. Kafer suggests, “rather than bend disabled bodies and minds to meet the clock, crip time bends the clock to meet disabled bodies and minds” (2013, p. 27). Crip time is then a crucial feature of disabled experiences and must be considered when creating spaces welcoming of all bodyminds.
Chronopolitics is another way of understanding time as intrinsically tied to political behavior. Since bodies are inherently political, especially disabled ones, how disabled bodies move in time becomes a political behavior. The modern concept of time as we understand it was “discovered” in the fifteenth century, partly in order to frame new modern ideas of societal and individual progress. This conception of time led to ideas of human destiny, which motivated colonial projects across Europe and the United States (Toulmin & Goodfield, 1965). Time itself is socially constructed through political processes (Becker, 2019). Chronopolitics creates and maintains clock time. Within a university setting, Zembylas (2023) argues that chronopolitics is the “affective milieus” (how everyday interactions between individuals and their surroundings are imbued with power dynamics) within institutions of higher learning, as he:
…draws on existing studies in neoliberal academia to argue that changing academics’ affective habits created by dominant time discourses and practices requires the disruption of affective milieus in which time is channeled, routed and molded (2023, p. 493).
Zembylas proposes the liberatory potential of disrupting the chronopolitics of neoliberal academia. Within academia, chronopolitics functions as a hyperfocus on speed, efficiency, productivity, and manageability. Chronopolitics “…refers to the politics of time governing academic knowledge generation, epistemic entities, and academic lives and careers, as well as academic management processes more broadly speaking” (Felt, 2017, p. 54). In their autoethnography, Isaacs (2020) discusses the harm caused when disabled bodies, specifically bodies who stutter or do not speak efficiently, come into conflict with chronopolitics. These bodies are punished for their slowness and inability to meet the fast pace necessitated by chronopolitics. These functions of chronopolitics within academia align with neoliberal goals. Chronopolitics is the clock time that disabled bodyminds fail to adhere to, whose authority crip time pushes back against. Functioning within higher education, conferences are impacted by chronopolitics. When we are late to sessions because of our disability, we are punished for our lateness. The pressure to attend conferences to advance our careers is caused by the need to adhere to this academic time.
Many scholars have argued that academia is rooted in a tyrannical neoliberalism from which we must liberate ourselves for a more just and inclusive culture. The same neoliberal values can be found in librarianship and library work (Brady, 2023). Neoliberalism’s conception of the ideal worker is defined by the rigorous expectations for publishing and near-constant production.
This notion of the ideal worker sets the standard for accountability and performance in neoliberal higher education [and librarianship]. For example, performance-based research funding and reward systems now determine the kind of research projects, academic units, and professional behaviors that are valued (Vázquez & Levin, 2022, paragraph 3).
The endless grind of both librarianship and academia in the battle for higher prestige, doing more with less, constant production, and what Beretz calls the “academic culture of heroic stamina” (2003, p. 52) ignores the mental and physical consequences, especially for marginalized folks who already face additional barriers to their participation. Vázquez and Levin (2022) argue that neoliberalism results in symbolic violence stemming from the “managerial practices” of neoliberalism. We use crip theory as a framework and materialist critique to examine the ways both librarianship and academia, which are steeped in neoliberal values of productivity, compose disabled bodies. Despite moves in librarianship and academia to be more inclusive and focus on diverse hiring, a perception remains that physically disabled people are unable to produce adequately, and production is the ultimate neoliberal goal. Even when disabled academics are able to conform to academia’s rigorous expectations for production, they are still punished or experience adverse long-term effects on productivity, morale, job satisfaction, and physical or mental health (Pionke, 2019). Tenure and advancement within academia are not based solely on merit but on personality politics of who is seen to “fit” within academic culture, which often excludes any disabled people. Neoliberalism is firmly rooted in white supremacy. We acknowledge that with our whiteness, we are complicit in neoliberalism and the harm it causes.
Mental load is often used to describe the invisible mental work that is done by women in the household to continually keep things running, involving things like planning, anticipating, and organizing (Emma, 2018). We take this concept of invisible mental work and apply it to disability, highlighting the extra planning, considerations, and thoughts that disabled people have to navigate a conference setting. For example,
Pre conference:
Conference:
Post conference:
The above sketch is just a fraction of the thoughts that continually bombard my consciousness as I am getting ready for, going to, and following a conference. Typically, I am good at silencing racing thoughts, but during a conference these are ongoing. When a space is clearly not created for you, it takes more energy, advocacy, and creativity to navigate.
At a conference, I have the mentality of a marathon runner (something I could never do in real life): I am strategizing when to rest, when to spend, and those calculations need to be made ahead of time. I do not have the luxury of listening to my body in the moment and adjusting. If my body’s pain starts to get worse, my calculations are wrong and it is already too late. I am masking the inner turmoil these decisions cause, because I know if others could see my racing brain they would be uncomfortable, so instead I hold this discomfort for them. With my mental calculations, I try to predict the future, for my body, career, and connections with others. Ultimately, it is a zero-sum game: even when I win in one of those three areas, I lose in others.
Even if the chairs are too small and add more pain to my already pain-riddled body, I do not have the luxury of opting out. If I want a career in academia, if I want people to listen to and respect my voice, if I want to be able to continue to pay for the things that keep my body moving such as medication, physical therapy, massage, and more, conferences are required regardless of the toll they take and the damage they do. So I push. Having this mental burden of continually making these decisions makes me feel isolated and distinctly other. This feeling of otherness and the reluctance from academia and librarianship to embrace and accept folks with disabilities (Pionke, 2023) is why I force my way in and make space for myself and others like me. As Beretz highlights, “Illness and injury, after all, are inescapable realities of human life” (2003, p. 51). Since illness is a part of life, I reject the medicalized view that disabled folks are damaged, and demand acceptance for my body now, especially since it will never be healed (Isaacs, 2020). We have every right to be able to pursue our passions, ideas, curiosities, and voice just like our nondisabled peers. And I will claim that space for myself, and others, even if it exhausts me to my core. Accessibility is not the responsibility of the individual; it is the responsibility of the system (Manwiller, 2021).
The concept of a disability tax refers to the fact that it is often more expensive to move through the world in a disabled body (Olsen et al., 2022). In our disabled bodies, conference attendance is significantly more expensive for us than it is for many of our peers.
When I travel through the airport, I need to tip the workers who push my wheelchair. I sometimes travel with an assistive device, which is cumbersome and expensive. The difficulty of moving through the airport means I need to pay to check a bag, as it would be too taxing to navigate me, my rollator, and a carry-on bag. I am denied the privilege of a free carry-on bag if I decide to travel with my assistive device. I cannot take public transportation while carrying a heavy bag and, therefore, need to take a taxi or Uber to and from the airport. I need to stay at the conference hotel to rest between sessions and reduce commute time to the conference. It is hard for me to share a room with peers, as during conferences, I often fall asleep by 7 p.m. to rest up for the next day. Many of my friends will stay up to forty minutes away from the conference in a cheaper area, as a group, and take public transportation to attend the conference. None of these cost-saving measures are possible for me. I need to be able to rest between sessions. I also must arrive the day before the conference and pay for that extra night. It is impossible for me to take an early morning flight and attend conference sessions on the same day. Paying for an extra night before and after the conference is challenging on a graduate student’s budget. Attending conferences as a disabled person is costly for me, both the cost to my health from exerting myself and the financial cost of funding a conference trip in a disabled body.
Despite these challenges, I continue to attend conferences because they are required to succeed as an academic. I need a venue to share my work, to network with colleagues, and yes, get lines for my CV. Some days, it all feels like too much, and I wonder if academia is the best place for me. But I feel a stubbornness and a refusal to be forced out. I know that my scholarly work matters and that our field needs people who are pushing back in the ways I am. If I were to give up, that would be one less disabled voice holding academia to task and pushing for a more inclusive and accessible culture. And also, I do this work because I can. While there is a financial and physical toll on me, I am able to succeed in academia despite what it costs me. I have periods of relative health when I am able to work the long hours required for this field. I have the kind of brain that allows me to work on projects weeks ahead of their deadlines so that even if I need to take time off when I’m ill, I can mostly complete my work on time. Most of the time, unless I am using my mobility aid, I am invisibly disabled. While there is pain that comes with not being seen, there is also a privilege in being seen as nondisabled and receiving the accompanying advancements. I am white and do not experience additional barriers in academia because of my race.
It is likely that while I experience some disadvantages because of my disability, gender, and queerness, I mostly benefit from academia’s exclusionary politics. I do this work for those who can’t or won’t, so disabled voices are still heard in academia. I do this work because the cost I pay is so much less than the cost many people experience. And also because I love it. I love the challenge and living in a world of ideas. I love that I spend my days writing and thinking, and I love that I can do this work with my friends. So even when it’s hard, and even when there are days I want to give up, I know I won’t. I’m in it for the long haul.
After conferences, I often spend up to a week in bed recovering and am unable to work. I become anxious about missing deadlines, and I worry I won’t ever feel any better. It is hard to explain to my professors and peers why I need so much rest after conferences, when they are able to return to work the day after traveling back from a conference. I feel frustrated by the toll the conferences take on me and my need to rest. Even those who know about my disability don’t understand what it’s like to exist in my body. It is hard to take the break I need in an academic culture where there is no expectation that rest might be needed. I find myself doubting my chosen career and wondering if I can make it through another conference. Kafer writes, “we are all to be smoothly running engines and disability renders us defective products” (2013, p. 54). When I can’t get out of bed for a week and deadlines pile up, I am made to feel like a defective product, a far cry from academia’s ideal worker.
This conflict between my body and the expectations of my profession can be explained through a consideration of crip time and chronopolitics. Conferences, libraries, and larger academic institutions have a hyper-focus on clock time, governed by chronopolitics. The conference clock time is an impossibility in the crip time chronically ill people exist in. Sleeping in and going to bed early to preserve energy forces us to miss sessions and opportunities. Waiting for the one accessible bathroom or the time it takes to navigate large conference halls makes us late to sessions. We are seen as failing when we cannot conform to the demands of conference time. And after the conference is over, the conflict between clock time and crip time only worsens. The penalties of conference attendance and forced conformity to clock time force us into bed and even further out of time. In the autoethnography of her stroke, Jane Speedy (2015) writes about the way time collapses when one is ill and in bed. For Speedy, illness calls into question one’s “situatedness” (p. 37). Time slows to a snail’s pace as we spend an hour staring at the wall or the inside of our eyelids. And time becomes fast, jumping forward when we sleep for hours and wake to find the sky already dark. Kafer (2013) writes that illness and disability impact how one experiences time: “Not only might they cause time to slow, or to be experienced in quick bursts, they can lead to feelings of asynchronicity or temporal dissonance” (p. 34). This disabled experience of time forces a departure from clock time or what Kafer calls straight time with its insistence on “firm delineation between past/present/future” (p. 34).
Carework is an idea coined by disabled writer and activist Leah Lakshmi Piepzna-Samarasinha (2018) to frame the work done by and within disabled communities of color as acts of love rather than chores or obligations. Distancing themself from traditional ideas of caring for disabled people as a burden, Piepzna-Samarasinha argues that caring for one another is a way to build power and foster communities where no one is left behind. We draw from the work of Black and brown queer femmes who center the importance of care and connection to use the idea of carework to offer new possibilities for both librarianship and academia, freed from neoliberal preoccupations with productivity that exclude disabled people. By making crip connections, we can create care networks that oppose typical productivity, embrace rest, and lead with care (Hersey, 2022). We offer care not because we feel obligated but because it is a radical way we can show our love for one another.
While conferences have many consequences for chronically ill people, they also offer moments of connection. We the authors, Leah and Rhys, met at a conference in 2022, and the friendship that has blossomed offers us both support and solidarity to persist in an ableist and often hostile climate. Crip community makes withstanding the hurdles of conference attendance more possible as it offers togetherness and knowledge that one is not alone. Living across the country from one another, we only see each other in person when we travel to the same conference. While conferences are hard on our bodies, the opportunity for us to connect is a lifeline for us within academia. Upon our first meeting, we processed our grief and rage over a lack of COVID precautions over text.
Leah: The last thing I wanted to say is how this conference has COVID policy in place, but they could submit a negative test like two weeks ago? How does that help? Okay, thank you for letting me get this out lol.
Rhys: It’s so ridiculous. People want a checkbox that they’ve done the right thing but are unwilling to take steps to keep people safe. Which I know is the same with literally everything else in our society but it’s heartbreaking watching it unfold when it could be different.
Leah: Agreed. And when it’s something so small as a mask.
This conference was held in 2022. By 2024, almost all conferences have dropped all pretense of COVID precautions despite the reality that at the time of this writing, COVID is still a concern for us as disabled people: we don’t need to get another disability like Long COVID; this is not Pokémon, we do not need to catch them all. Our friendship allows us space to express our deep rage and feel less isolated in our experience. Malatino writes of an “infrapolitical ethics of care” which he calls “a reliance on a community of friends to protect and defend one from violence, to witness and mirror each other’s rage, in empathy, and to support one another during and after the breaking that accompanies rage” (2022, p. 118). We offer each other the refuge to express and mirror our anger.
Participating in conferences together also offers the validation of the experiences and the struggles that accompany attendance, which makes conferences a bit more bearable. It is difficult to explain to nondisabled people what it’s like to exist in our bodies and how impossible it can feel. Rhys was encouraged by a professor to attend a doctoral poster session held after a long day of sessions. Rhys made it out of their hotel room and to the event hall but was immediately overwhelmed by the mass of unmasked people, the noise, and the lack of chairs and retreated to bed. They texted Leah, “I was feeling too shitty to go to the posters, so I ate pizza in bed.” Leah validated this decision to prioritize their health and rest in bed rather than pushing through, which would lead to an even bigger collapse later on. As we parted ways at the end of the conference, we both expressed our joy at meeting one another and promised to stay in touch.
Leah: Please stay in touch, Always happy for rants, successes, and just general life being a disabled grad student. I’m really glad we connected.
Rhys: I am too! You’re my first disabled PhD person so I will definitely take you up on that.
In the two years that followed, Leah and Rhys have texted each other regularly for support and to celebrate accomplishments. And our friendship has evolved into a professional partnership with the writing of this and other autoethnographies examining our experiences. Our friendship offers support that is unavailable to us through traditional socialization within the university. Celebrating our successes together allows us to create an independent rewards system to sustain our energy in a culture where rewards are withheld and often portioned out in racist, sexist, heterosexist, and ableist ways (Museus & LePeau, 2020).
Within our friendship, we are also creating a small culture of care in academia and librarianship. These systems of power do not love us back; we all need and give care along with the time, support, and resources to care well (Segal, 2023). By caring for each other, and embracing crip time, we create pockets of space and time for care where none exists. We reject the neoliberalism of capitalism together by caring for and validating each other. And this is why we continue to do the work and go to conferences. When we go to conferences and bring carework with us, we are disrupting both the conference and academic systems by creating belonging where there was none. We also go to conferences because it is when we are able to see each other and maintain our relationship, which has proven to be an important lifeline in both our personal and professional lives. We also talk about things outside of our academic experience, expanding our two-person care network outward and further into the liminal crip time. Capitalistic systems are not built with care, and by creating systems of care, we further push against capitalistic expectations of librarianship and academia (Hersey, 2022; Segal, 2023; Kafai, 2021).
This article is just one small conversation within a larger discussion of how to disentangle libraries and academia from their insistence on production and value of the ideal worker. While making conferences more accessible is a place to begin, until the pace of librarianship and academia changes, chronically ill folks will face the same ruptures of crip time and clock time that impede our success. In our writing, we turn to the future and what might be by using autoethnography, enabling us to attune to queer and crip futurity. Muñoz (2009) conceptualized queer futurities as “what’s not yet here”; these futures “insist on another time and place that is simultaneously not here yet but also to be glimpsed in our horizon” (p. 183). Inspired by Muñoz, we use our stories to reflect on what has been and ponder what could be. In this crip imagining, we speak into being the futures we want for ourselves. We manifest futures where rest is prioritized, where crip knowledge is valued and disabled people share with our well and nondisabled peers the lessons about wellbeing and community care. We aspire to a future where burnout is not glorified, where academics do not compete over who can work the most, where diverse work paces are considered for promotion. We imagine a future for library and information science where crip methodologies are esteemed as rigorous and given space in leading journals.
We would like to thank the Internal Peer Reviewer, Brea McQueen; the External Peer Reviewer, JJ Pionke; and Editorial Board member Jess Schomberg for their kindness and help with this work. We would also like to honor our crip connection as authors for allowing this co-written scholarship and thank our bodyminds for producing this work.
Anderson, N. (2024). Chronically honest: An autoethnographic paper on the experiences of a disabled librarian. In the Library with the Lead Pipe. https://www.inthelibrarywiththeleadpipe.org/2024/chronically-honest/
Becker, T. (2019). Chronopolitics: Time of politics, politics of time, politicized time. History and Theory, 62(4), 3-23. https://www.hsozkult.de/event/id/event-89282
Beretz, E. M. (2003). Hidden disability and an academic career. Academe, 89(4), 50–55. https://doi.org/10.2307/40252496
Brady, F. (2023). Scaffolded information literacy curriculum: Slow librarianship as a rejection of the hegemony of neoliberalism. Journal of New Librarianship, 8(2), 29–40. https://doi.org/10.33011/newlibs/14/2
Clare, E. (2017). Brilliant imperfection: Grappling with cure. Duke University Press.
Ellis, C. (2004). The ethnographic I: A methodological novel about autoethnography. AltaMira Press.
Hamraie, A. (2017). Building access: Universal design and the politics of disability. University of Minnesota Press.
Hersey, T. (2022). Rest is resistance: A manifesto. Little, Brown.
Holman Jones, S., & Harris, A. M. (2019). Queering autoethnography. Routledge.
Isaacs, D. (2020). ‘I don’t have time for this’: Stuttering and the politics of university time. Scandinavian Journal of Disability Research, 22(1). https://doi.org/10.16993/sjdr.601
Kafai, S. (2021). Crip kinship: The disability justice & art activism of Sins Invalid. Arsenal Pulp Press.
Kafer, A. (2013). Feminist, Queer, Crip. Indiana University Press.
Kasnitz, D. (2020). The politics of disability performativity: An autoethnography. Current Anthropology, 61(S21). https://www.journals.uchicago.edu/doi/full/10.1086/705782
Kuppers, P. (2014). Crip time. Tikkun, 29(4). https://muse.jhu.edu/article/558118/pdf
Malatino, H. (2018). Tough breaks: Trans rage and the cultivation of resilience. Hypatia, 34(1), 121-140. https://transreads.org/tough-breaks-trans-rage-and-the-cultivation-of-resistance/
Manwiller, K. Q. (2021, May 26). The inaccessibility of ACRL 2021. ACRLog. https://acrlog.org/2021/05/26/the-inaccessibility-of-acrl-2021
Manwiller, K. Q. (2019, October 27). Conferencing while chronically ill. ACRLog. https://acrlog.org/2019/10/27/conferencing-while-chronically-ill
Muñoz, J. E. (2009). Cruising utopia: The then and there of queer futurity. NYU Press.
Museus, S. D., & LePeau, L. A. (2020). Navigating neoliberal organizational cultures: Implications for higher education leaders advancing social justice agendas. In A. J. Kezar & J. R. Posselt (Eds.), Higher education administration for social justice and equity: Critical perspectives for leadership. Routledge. https://works.bepress.com/samuel_museus/113/
Olsen, S. H., Cork, S., Anders, P., Padrón, R., Peterson, A., Strausser, A., & Jaeger, P. T. (2022). The disability tax and the accessibility tax. Including Disability, 1(51), 51-86. https://ojs.scholarsportal.info/ontariotechu/index.php/id/article/view/170
Piepzna-Samarasinha, L. L. (2018). Care work: Dreaming disability justice. Arsenal Pulp Press.
Pionke, J. (2023). The interview process and people with disabilities. Journal of Library Administration, 63(4), 587–593. https://doi.org/10.1080/01930826.2023.2201724
Pionke, J. J. (2019). The impact of disbelief: On being a library employee with a disability. Library Trends, 67(3), 423–435. https://doi.org/10.1353/lib.2019.0004
Price, M. (2011). Mad at school: Rhetorics of mental disability and academic life. University of Michigan Press.
Richards, R. (2008). Writing the othered self: Autoethnography and the problem of objectification in writing about illness and disability. Qualitative Health Research, 18(12), 1717-1728. https://doi.org/10.1177/1049732308325866
Samuels, E. (2017). Six ways of looking at crip time. Disability Studies Quarterly, 37(3). https://dsq-sds.org/index.php/dsq/article/view/5824/4684
Segal, L. (2023). Lean on me: A politics of radical care. Verso Books.
Speedy, J. (2015). Staring at the park: A poetic autoethnographic inquiry. Left Coast Press.
Vázquez, E., & Levin, J. (2018). The tyranny of neoliberalism in the American academic profession. Academe: Magazine of the American Association of University Professors. https://www.aaup.org/article/tyranny-neoliberalism-american-academic-profession
Zembylas, M. (2023). Time-as-affect in neoliberal academy: Theorizing chronopolitics as affective milieus in higher education. Studies in Higher Education, 49(3), 493–504. https://doi.org/10.1080/03075079.2023.2240352
[1] Other marginalized groups such as people of color and queer and trans people, and people in the Global South also have complicated relationships with clock time and often fail to operate on the schedule of normative time. By normative time, we mean a time system governed by Western, white, non-disabled, cisgender, and heterosexual society.
Discover how to transform your ecommerce search architecture for 2025. Learn proven strategies for implementing AI-powered search, real-time inventory awareness, and personalization from a search engineer.
The post Beyond Basic Search: How to Clean Up Your Ecommerce Search in 2025 appeared first on Lucidworks.
The following post is one in a regular series on issues of Inclusion, Diversity, Equity, and Accessibility, compiled by a team of OCLC contributors.
The International Federation of Library Associations and Institutions (IFLA) recently published the IFLA Guidelines for Libraries Supporting Displaced Persons: Refugees, Migrants, Immigrants, Asylum seekers to provide practical guidance for libraries supporting these groups. The guidelines define displaced persons as “persons who have been forced or obliged to flee or leave their homes or places of habitual residence (whether in their own country or across an international border), in particular as a result of or in order to avoid the effects of armed conflict, situations of generalised violence, and human rights violations or natural/human-made disasters. In the context of these guidelines, we refer as a whole to all these different groups: asylum seekers, immigrants, migrants, and refugees.” The guidelines cover services and programs for users, policies, staff training, and other topics. As these are broadly written for international use, libraries would use the guidelines as a starting point in the formation of their own policies and practices.
There is clearly a need for this type of publication, as over 117 million people were forcibly displaced in 2023. There are many helpful recommendations, including offering a working space to humanitarian organizations inside the library and creating pop-up library spots inside refugee camps and asylum centers. I wish the authors had explicitly acknowledged the differences between voluntary migration and displacement, which is involuntary migration caused by horrible conditions in the place people are leaving. While cultural and language differences may impact many immigrants and migrants, those who have been forcibly displaced are more likely to have additional disadvantages and special needs because of their displacement. Libraries are better able to serve displaced persons when staff understand the differences between these migration situations. I am guessing the authors of the publication chose to use the term “displaced persons” because it is not a legal term and many who do not legally qualify as refugees may have fled violence or extreme poverty and suffered terribly. I believe libraries will find this publication useful if they are mindful of these situational differences when reading the guidelines and reviewing resources cited in the bibliography. Contributed by Kate James.
All of us who treasure libraries and value the roles they play in a free and democratic society have been wondering how to prepare for the 119th United States Congress and the 47th President of the United States. On 15 January 2025 at 4:30 p.m. Eastern Time, the American Library Association Public Policy and Advocacy Office will offer “Standing Up for Libraries: The Next 100 Days,” a webinar that is free for all ALA members. Although attendance is limited to 1000, a recording of the session will be available to ALA members through 30 January. ALA promises to “offer tangible steps for library advocates moving forward and preview upcoming legislation and litigation that will impact the library field.”
Since 1945, the Public Policy and Advocacy Office has been the voice of libraries speaking to the government of the United States and keeping libraries and their advocates informed about government policies and actions. The office has been instrumental in furthering the interests of libraries and users in the realms of privacy, funding, copyright, government information, education, and related areas. Keeping informed and promoting library values remains as important as ever. Contributed by Jay Weitz.
Binghamton University Libraries was honored with the 2024 South Central Regional Library Council Prism Award, which honors library workers or organizations for work in advancing Diversity, Equity, Inclusion, Justice and Accessibility. This work includes implementing structural changes, actively becoming antiracist or reimagining policies to be inclusive. Binghamton Libraries have sustained several initiatives, ranging from fostering the library as a safe and inclusive space and place to diversifying collections.
Binghamton University was one of a number of OCLC Research Library Partnership institutions we interviewed in order to better understand how research libraries are approaching diversifying collections. It is great to see their work—which has been ongoing for some time—acknowledged in this way. Contributed by Merrilee Proffitt.
The post Advancing IDEAs: Inclusion, Diversity, Equity, Accessibility, 7 January 2025 appeared first on Hanging Together.
Air pollution is one of the world’s greatest environmental threats to public health. In Asunción, Paraguay, where pollution levels are increasing year after year, there is an urgent need for initiatives to tackle this problem. Project Respira, funded by the Mozilla Foundation as part of the Open-Source AI for Environmental Justice Award 2024, is presented as the first air quality prediction service for the city of Asunción and its surroundings. This initiative is an example of how free technology, open data and collaboration can provide innovative solutions to critical problems related to climate change and environmental justice.
Project Respira is a pioneering initiative that aims to provide air quality forecasts for the Greater Asunción area. The project was designed to empower citizens by providing not only forecasts, but also health recommendations tailored to the local context, allowing them to make informed decisions to protect their health on days when air quality is compromised. From planning outdoor activities to avoiding exposure during periods of high pollution, this service represents a crucial step towards reducing the health risks associated with pollution through open information and citizen education.
The system includes a web platform and integrated bots in Telegram and X (Twitter) for easy access to the services. Features include daily air quality forecasts, guidance on how to protect your health based on pollution levels, statistics generated from sensors distributed across the city, and access to relevant research. In addition, the project’s open repositories on GitHub allow others to replicate and adapt the system in different contexts.
Climate change has exacerbated environmental problems in Paraguay, creating increasingly urgent challenges. Pollution levels have increased in recent years, driven by forest fires, extreme weather events and urban sprawl. One example was in September 2024, when a thick layer of smoke covered the country for weeks as a result of fires in northern Paraguay, Bolivia and Brazil. This phenomenon caused respiratory problems for thousands of people and reignited the debate on the need for tools to anticipate and mitigate the risks associated with pollution.
Against this backdrop, Project Respira emerges as an effective and practical response, developed by the community and for the community. This system not only provides a predictive tool for air quality, but also reflects how citizen organisation and open technology can be used to address the challenges of climate change.
The initiative focuses on the principles of environmental justice, ensuring that benefits reach the most vulnerable. At the same time, it builds the capacity of communities and authorities to respond to environmental emergencies. Phenomena such as forest fires and air pollution, which are becoming increasingly common in South America, underscore the importance of projects like this to protect public health and build resilient communities in the face of climate challenges.
Project Respira is the result of an important collaboration between civil society, academia and the open source community in Paraguay. The Faculty of Engineering of the National University of Asunción has been a key pillar, providing not only infrastructure and technical knowledge, but also access to open data essential for the operation of the system. A crucial aspect of this effort has been the active participation of the Paraguayan open source community, especially through the organisation Girls Code Paraguay, an initiative led by women in technology that has been instrumental in the development and implementation of innovative solutions. Their commitment and experience have made the project not only viable, but also replicable and scalable in different contexts.
The design of the system follows open technology principles, meaning that its components are adaptable and can be implemented by other regions, both within Paraguay and in other countries. Although in its early stages the project is focused on the Greater Asunción area, it is designed to provide regional forecasts in the near future, expanding its scope and benefiting more communities.
The success of Project Respira is largely due to access to open data and the use of advanced Machine Learning models. Predicting pollution levels requires not only pollutant measurements, but also accurate meteorological data to help model the dynamics of pollutant movement through the city. Pollution level measurements from the PM2.5 Particulate Matter Monitoring Network of the Faculty of Engineering of the National University of Asunción are combined with climate data from Meteostat, a global service that provides historical and current meteorological information.
A second innovative aspect of the project lies in the development of strategies to improve the quality of data from field instruments. A major challenge in collecting pollution and climate data in the Latin American context is the need for regular calibration of measuring instruments, which lose reliability over time due to wear and tear. This leads to operating and maintenance costs that are difficult to sustain.
Project Respira, in collaboration with researchers from the National University of Asunción, is using an innovative remote calibration system for its low-cost sensors, using data from the US Embassy’s AirNow system in Paraguay. Remote calibration is a solution that significantly improves the reliability of the data collected by the sensors. This solution allows low-cost sensors to be installed at numerous points around the city and remotely calibrated on a regular basis using a single high-end sensor as the standard, significantly reducing operation and maintenance costs. This is critical in contexts where resources are limited, such as in many Latin American cities. The approach is also replicable, allowing other regions to implement similar solutions tailored to their needs.
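As a rough illustration of the remote calibration idea, the sketch below fits a straight-line correction that maps a low-cost sensor's raw PM2.5 readings onto readings from a reference instrument taken over the same period, then applies that correction to new readings. The post does not say which calibration method Project Respira uses, so the linear model, the function names, and the sample numbers here are assumptions made purely for illustration.

```python
from statistics import mean


def fit_linear_calibration(raw, reference):
    """Least-squares fit of reference ~= slope * raw + intercept.

    `raw` and `reference` are paired readings taken at the same times.
    Assumption: a linear correction is adequate; the project may use
    a different model.
    """
    if len(raw) != len(reference) or len(raw) < 2:
        raise ValueError("need at least two paired readings")
    raw_mean = mean(raw)
    ref_mean = mean(reference)
    variance = sum((x - raw_mean) ** 2 for x in raw)
    if variance == 0:
        raise ValueError("raw readings must not all be identical")
    covariance = sum(
        (x - raw_mean) * (y - ref_mean) for x, y in zip(raw, reference)
    )
    slope = covariance / variance
    intercept = ref_mean - slope * raw_mean
    return slope, intercept


def apply_calibration(readings, slope, intercept):
    """Correct new raw readings; concentrations are clamped at zero."""
    return [max(0.0, slope * r + intercept) for r in readings]


# Hypothetical paired readings in micrograms per cubic metre.
low_cost = [10.0, 20.0, 30.0, 40.0]
reference = [8.0, 13.0, 18.0, 23.0]
slope, intercept = fit_linear_calibration(low_cost, reference)
corrected = apply_calibration([50.0], slope, intercept)
```

Because the fit needs only paired readings, it can be recomputed on a schedule for every low-cost sensor without visiting the device, which is where the savings in operation and maintenance would come from.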
Although Project Respira is in its early stages and is currently limited to the greater Asunción area, its modular and open design allows for a future in which this service can be expanded throughout Paraguay and eventually replicated in other regions of the world. This approach demonstrates the potential of collaboration between academia, civil society and the open source community as an effective model for addressing complex environmental problems.
The project also highlights the importance of open data in the fight against climate change. Thanks to the availability of high-quality data from a variety of sources, it is possible to build predictive models that are not only effective, but also accessible and easily reproducible. This approach reinforces the need to promote open data policies that facilitate innovation and allow other similar initiatives to develop, adapted to the local specificities of each region.
In a world increasingly affected by climate change, initiatives like Project Respira play a critical role in protecting public health, empowering communities and building a more resilient future. By combining advanced technology, open data and interdisciplinary collaboration, this project sets a precedent for how environmental challenges can be tackled now and in the future.
Happy New Year! As we begin 2025, we wanted to take a moment to look back at what we’ve done over the past year. Please have a look at NDSA’s accomplishments – and feel free to reach out to NDSA with any questions on how you can get involved!
NDSA Leadership has achieved significant milestones this year, with a focus on refining our mission and vision and growing NDSA into a more fully-fledged professional organization in response to members’ stated support needs.
In January 2024, NDSA Leadership released an RFI seeking expressions of interest from potential new host organizations. Over the course of the RFI process, we have been exploring alignment with other organizations, assessing their suitability as long-term hosts – and clarifying what NDSA can bring to the table for its host organization. To that end, in June 2024, we updated NDSA’s foundational strategy and outlined specific activities and initiatives to be completed in the next 3-5 years that will strengthen and stabilize NDSA’s shared governance, enhance membership services and outreach, and increase transparency through new communication strategies.
After the exploratory conversations kicked off through the RFI, in June we established a recurring monthly meeting with our prospective new host, with the goal of establishing a sustainable funding model and business plan. Together, we developed and conducted a member fee survey to assess members’ willingness and ability to sustain the organization through various types of financial contributions. Preliminary analysis of the survey data supports a fee-for-membership model as one element of a set of diverse funding sources, and underlines NDSA’s need for financial resilience. We hope to release a fuller report documenting the survey findings in the coming months.
Concurrently with the member fee survey, we also submitted a preliminary proposal for an IMLS planning grant in September. If awarded, the grant funding would allow NDSA to hire a part-time program coordinator who will assist with developing a business plan and transitioning NDSA to a fully self-funded model. We plan to begin drafting the full grant proposal soon, with anticipated milestones stretching into 2025.
Finally, NDSA Leadership is currently working on updated public governance documentation, integrating updates to bylaws and clarifying roles within the Coordinating Committee (CC). On the recommendation of the Membership Working Group, which suggested several improvements to the NDSA member experience, we have decided to add a CC Secretary role to oversee critical records, and we are establishing a Standing Membership Group that will focus on approving and onboarding new members – and supporting all NDSA members through enhanced outreach. Look for announcements about these efforts coming soon!
NDSA welcomed eight new members in 2024, representing a diverse mix of academic, commercial, and nonprofit organizations:
For existing members, the new year is a good time to make sure your organization’s contact information is up to date. A simple form is available to assist with this process.
Stay tuned for a recap of the activities and accomplishments that our Interest Groups and Working Groups achieved in 2024!
The post NDSA Leadership 2024 Year in Review appeared first on DLF.
The ‘Open Data Workshop’ took place on Thursday, December 12, 2024, at the new conference center of the Region of Central Macedonia. The event, organised by the Digital Governance Sub-region, focused on the values of open data in modern societies, aiming to promote transparency, innovation, and sustainable development.
The event featured panels and discussions on using public and open data to create digital tools that address contemporary challenges for the benefit of citizens. Charalampos Bratsas, President of OKFN Greece and Assistant Professor at the International Hellenic University (IHU), coordinated the panels, highlighting the importance of human oversight in data management and the need for enhanced transparency and interoperability in the digital transformation of both the public and private sectors. The workshop opened with greetings from representatives of the public sector, such as Nikos Tzollas, Deputy Regional Governor for Digital Governance. Afterwards, the invited speakers discussed issues related to the values of open data in contemporary societies. They included Kostas Gioulekas, Deputy Minister of the Interior (sector of Macedonia and Thrace); Kostas Vassilopoulos, Deputy Mayor for Digital Policy and E-Governance; Athanasios Thanopoulos, President of ELSTAT; Theophilos Mylonas, President of SETPE; and others. The discussion panels included “Leveraging Open Data to Create Tools for the Benefit of Citizens”, “Open Data: Infrastructure – Challenges” and “Availability of Open Data”.
OKFN Greece also hosted a dedicated panel, titled “Workshop: Open Data Repositories”, where Lazaros Ioannidis, researcher at OKFN Greece and PhD candidate at IHU, presented the digital platform developed within the UPCAST Project to host an open data marketplace. Additionally, the launch of the ‘Open Up Thessaloniki Climate 2025’ competition was announced, inviting participants to register or contribute data to develop the competition’s open data platform.
Overall, the ‘Open Data Workshop’ was an important step towards raising awareness about open data and the challenges that arise from its implementation in society and the public sector. The contribution of OKFN Greece to the success of this event was crucial, setting the tone for future developments in the field of digital governance and the use of open data to address contemporary challenges.
The following post is part of a series that documents findings from the RLP leadership roundtable discussions.
Research libraries are experiencing increasing demand for research support services, such as open research and data management, research analytics, and systematic reviews, often in collaboration with other campus partners. This presents significant challenges for effectively resourcing and scaling these services.
The OCLC Research Library Partnership (RLP) convened the Research Support Leadership Roundtable in October 2024 to discuss how libraries are making both incremental and large-scale changes to scale and resource their research support services.
The roundtable included 45 participants from 37 institutions in four countries, who engaged in four separate discussions focused on the evolving landscape of research support:
Binghamton University | Smithsonian Institution | University of Nevada, Reno |
British Library | Stony Brook University | University of Pittsburgh |
Carnegie Mellon University | Syracuse University | University of Southern California |
Clemson University | Temple University | University of Sydney |
Colorado State University | Tufts University | University of Tennessee, Knoxville |
George Washington University | University of Calgary | University of Texas at Austin |
Getty Research Institute | University of California, Riverside | University of Toronto |
Hofstra University | University of California, San Diego | University of Utah |
Institute for Advanced Study | University of Delaware | University of Warwick |
Monash University | University of Glasgow | Vanderbilt University |
Montana State University | University of Illinois Urbana-Champaign | Virginia Tech |
Ohio State University | University of Leeds | |
Rutgers University | University of Minnesota |
Our conversations focused on library organization and staffing, and participants were asked to consider these framing questions:
This post offers a synthesis of our discussions. RLP leadership roundtables observe the Chatham House Rule; no specific comments are attributed to any individual or institution.
Resourcing research support services at an adequate level is a universal pain point among RLP libraries. What differs is how libraries have organized their staffing and services to meet these needs, and we heard from RLP libraries that are structured across a spectrum of organizational configurations.
At one end of that spectrum are libraries that rely primarily upon a decentralized cadre of subject liaisons to deliver research support. Liaisons provide a direct contact for users, roughly in parallel with the academic organization of the university. While most RLP institutions participating in the discussion rely on liaison librarians for some degree of research support, only a couple of institutions reported relying primarily upon liaisons to provide research support. And these libraries anticipate reconfiguration toward a mixed model as vacancies occur.
At the other end of the continuum are libraries that deploy a centralized functional model. Here, service-oriented teams manage library tasks—such as research data curation, copyright consulting, or collection development—across all disciplines, rather than assigning multiple responsibilities to individuals within a single subject area. Duane Wilson notes in a recent historical literature review that since 2011, more libraries have moved to this model, with librarians focusing on specialized functions, such as collection development, scholarly communication, and research impact.[1] Indeed, 6 of 36 university libraries in the US, UK, and Australia participating in the roundtable have shifted to a functional model.
Most libraries participating in the discussion, however, fall somewhere in the middle of the spectrum, with approximately two-thirds of institutions deploying a combination of these strategies. Library scholar Sheila Corrall has called this a “mixed structure,” with a combination of functional librarians supporting services like scholarly communications and data curation, and liaison librarians supporting one or more disciplinary areas.[2] The growth of specialized research support services (such as scholarly communication, data management, and research impact) has further driven this shift, creating increasingly mixed or matrixed approaches to service delivery.
Roundtable discussions revealed that many RLP libraries are experimenting with organizational structures to increase capacity for research support. While a few institutions have eliminated legacy liaison roles altogether, most are being “reorganised around the edges instead of completely discarding their old structure and beginning anew.”[3]
Overall, roundtable participants expressed differing opinions about their continued use of a mixed organizational structure heavily reliant upon distributed subject liaisons. Many value how the liaison model supports collections-focused research and personalized support to faculty and students. But others expressed frustration with “historical positions” offering bespoke services that are neither scalable nor strategic.
One public US university library described its research support services as having developed in an ad hoc manner, resulting in a mixed structure. While this decentralized environment has encouraged innovation and experimentation, the participant noted it was “neither coordinated nor strategic.” Recognizing the unsustainability of this arrangement, the library is reorganizing, with the intention of better addressing under-supported areas like research services.
To scale within the existing liaison framework, several libraries are experimenting with team-based approaches. A private US institution, for example, organized its subject liaisons into functional teams like research impact and data services, but with mixed results. While each librarian has deepened their expertise by focusing on a specific functional service area, librarians struggled to do “double duty,” balancing functional responsibilities with subject expertise and networks—a challenge worsened by shrinking professional development budgets. Another public US institution tried a similar staffing configuration but found it unsuccessful, ultimately reorganizing liaison librarians into fully functional roles. Two other institutions are currently testing similar strategies.
Several participants expressed disappointment with their library’s inability to scale research support services using the mixed model but see near-term change as unlikely, due to a “lack of political will.” Another participant, however, sees this matrixed structure as “mostly working” to scale research support, as subject liaisons support a “middle zone” of service in an area like research data management, reserving the most specialized work for the dedicated RDM librarians.
Peppered throughout our roundtable discussions were many comments about the benefits and challenges of the mixed organizational structure. Often, things that many perceived as benefits simultaneously present challenges.
While most RLP libraries participating in this roundtable rely on a mixed structure, a few have transitioned to a functional or service-oriented model. These shifts, driven primarily by the need for greater scalability and strategic alignment with institutional priorities, can feel radical for both librarians and users.
One UK institution adopted a functional model several years ago, driven by the need to create capacity for open access support. The change has been successful, providing better support to users, and with the additional benefit of helping library staff and services “feel more embedded in the university.” The library plays a larger role on campus and is now a part of institutional strategy and planning conversations.
In the US, a public university also shifted to a functional model, reorganizing subject liaisons into two teams: student success (serving undergraduates) and research support and open scholarship (targeting faculty, researchers, and graduate students). The previous subject liaison model was seen as unsustainable.
After a long period of consultation, another university library is transitioning from a liaison model which delivered quality one-on-one service to researchers but lacked scalability and agility. Seeking greater research support capacity, the library redistributed education, engagement, and research responsibilities across three functional teams. The research services team will be further subdivided into research impact and publishing support.
One participant described research libraries as being at a significant moment of change, as traditional liaison models (centered on collection development, information literacy, and reference support) become less effective as research support demands increase. Collection development work is also increasingly centralized.[5] For one large research university with more than 80,000 students and nearly 20,000 faculty and staff, the move to a functional model should support more agile, scalable service delivery in response to institutional needs. The institution is implementing a tiered approach: ideally, 80% of support will be delivered via self-service access by users, followed by small-group workshops and, lastly, specialized high-touch support deployed strategically, not as the default.
Most institutions that shifted to a functional model from a mixed structure described a fairly rapid transition, following extensive study and consultation. However, one public US institution made a gradual, decade-long transition from the liaison model, primarily by reallocating vacant liaison roles to more strategic functional roles in areas like research data management, scholarly communications, and teaching and learning.
Library reorganizations have significant impacts on workers, and roundtable participants described a gamut of responses from librarians during their reorganizations. While some librarians thrive, developing new skills and expertise, others struggle, grieving the loss of professional identities and fulfilling responsibilities. Re-skilling is also a challenge, as increased needs for professional development, training, and conference attendance often collide with institutional austerity measures.
A significant challenge reported by one public US institution was the loss of faculty relationships. Subject liaisons often attended departmental meetings and built deep connections. Structural changes can disrupt these relationships. Faculty members accustomed to contacting a familiar subject liaison may balk when asked to seek assistance through a general email address, and both users and librarians may quietly revert to using the former model.
Like the mixed organizational structure, the functional model has both benefits and challenges:
Reflecting on these roundtable discussions, I see an urgent need for libraries to evolve in a complex, rapidly changing environment. Research libraries, particularly those affiliated with prestigious research universities, must develop services that align with the institution’s research and teaching missions. Institutional complexity often slows change, especially when stakeholders are invested in established structures, relationships, and workflows.
Many libraries continue to leverage legacy service models developed in an earlier era—when collection development, information literacy, and reference support were primary needs. These models predate online catalogs, WorldCat, the internet, digitized resources, linked data, e-books, and AI. In our leadership roundtable discussions, participants expressed a desire to explore new organizational structures. Yet, many acknowledged that near-term changes remain unlikely due to steep switching costs—the costs of shifting from one approach to another, such as new organizational structures, workflows, technologies, and relationships. Transitioning to new models demands effort, planning, political capital, change management, and patience.
However, as my colleague Brian Lavoie has written elsewhere, there are also status quo costs to consider. These arise from avoiding change and continuing existing practices. Roundtable participants delineated many status quo costs in our discussion, such as limited capacity for research support, reduced visibility among stakeholders, uneven service provision, and difficulty strategically deploying resources to support institutional priorities. Switching costs may be high, but status quo costs may be higher, with potential risks of diminishing the library’s value, autonomy, and access to resources.
There’s no silver bullet. No universal solution will work for every library. Tradeoffs are inevitable, and each library must consider its strategic priorities, resources, work climate, and overall business needs. My hope is that our roundtable discussions—and this synthesis—provide support to research libraries as they navigate change.
[1] Wilson, Duane. “Constant Change or Constantly the Same? A Historical Literature Review of the Subject Librarian Position.” College & Research Libraries 85, no. 7 (November 1, 2024): 1035. https://doi.org/10.5860/crl.85.7.1035.
[2] Corrall, Sheila. 2014. “Designing Libraries for Research Collaboration in the Network World: An Exploratory Study.” LIBER Quarterly: The Journal of the Association of European Research Libraries 24 (1): 17-48. https://doi.org/10.18352/lq.9525.
[3] Stueart, Robert D., and Barbara B. Moran. Library and Information Center Management. 7th ed. Westport, CT: Libraries Unlimited, 2007, 188.
[4] Wilson, 11.
[5] Wilson, 2.
The post Examining library structures to scale research support services: Insights from an OCLC RLP leadership roundtable appeared first on Hanging Together.
Mark your calendars! On 14 January, 2025, the Open Knowledge Foundation will host an engaging online event to dive deep into the intersection of technology and democracy. Together, we’ll reflect on the transformative Super Election Year 2024, exploring how technology shaped electoral processes worldwide and discussing the future we can build together.
Date: January 14th, 2025
Where: Online
Time: 14:00 UTC
Confirmed Speakers
Stay Tuned: More details on speakers and sessions coming soon.
This event is part of The Tech We Want initiative—our ambitious effort to reimagine how technology is built and used. We believe software should be useful, simple, long-lasting, and, most importantly, focused on solving real-world problems.
Elections are a cornerstone of democracy, but they are not immune to the challenges posed by rapid technological change. From digital voter registration systems to combating misinformation and ensuring secure, transparent electoral processes, the role of technology in 2024 elections was unprecedented.
This online gathering is a follow-up to our 2023 roundtable discussions on Digital Public Infrastructure for Electoral Processes, where experts from across the globe highlighted the need for inclusive, open, and reliable tech to support democratic practices. If you missed it, you can learn more about the initiative here.
The event will bring together policymakers, technologists, activists, and thought leaders to:
This is more than a conversation—it’s a call to action. Let’s ensure that the next generation of technology is built to empower citizens, uphold transparency, and strengthen democratic systems.
The discussions will directly contribute to a submission for the UN Special Rapporteur on Freedom of Expression’s 2025 Thematic Report on Freedom of Expression and Elections in the Digital Age. This is a unique opportunity to ensure that our collective vision for responsible and impactful technology influences global policies and strengthens the democratic process worldwide.
Whether you’re a developer, advocate, or simply curious about the intersection of technology and democracy, this event is for you. Let’s come together and shape The Tech We Want—tech that works for everyone.
For updates, follow the Open Knowledge Foundation on Mastodon, Bluesky, X and LinkedIn.
We can’t wait to see you there!
In collaboration with Romina Colman
How can AI help non-technical users validate and improve the quality of their data in the Open Data Editor, taking into account transparency, privacy, and functionality?
In this blog post, we reflect on the collaboration, process and outcomes of integrating an AI feature into the Open Data Editor (ODE) to help its users better understand their tables of data. We describe the challenges for which AI could provide a solution, our exploration of potential AI features, and the first implemented AI feature to help users better understand their data. We reflect on this integration and, finally, outline the roadmap for additional AI features to further improve ODE’s functionality and user experience.
The current functionality of the Open Data Editor is aimed at providing “data validation and basic cleaning” capabilities to improve the quality of data in tables. In plain language, ODE checks for errors in tables according to specific rules. In ODE, these rules are defined by Frictionless Data, an OKFN initiative that provides standards and software implementations to improve data quality and interoperability.
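To make "checks for errors in tables according to specific rules" a little more concrete, here is a minimal sketch in plain Python. It is not the Frictionless implementation and does not use its API; the function name `check_table` and the three rules it applies (blank header, duplicate header, row length mismatch) are assumptions chosen purely for illustration.

```python
import csv
import io


def check_table(csv_text):
    """Return a list of (row_number, message) problems found in a CSV string.

    Hypothetical helper for illustration only; the rules below are
    assumptions, not the Frictionless rule set.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [(0, "table is empty")]

    header = rows[0]
    problems = []
    seen = set()
    # Rule 1 and 2: every column needs a non-blank, unique header.
    for position, name in enumerate(header, start=1):
        label = name.strip()
        if not label:
            problems.append((1, f"column {position} has a blank header"))
        elif label in seen:
            problems.append((1, f"duplicate header '{label}'"))
        seen.add(label)

    # Rule 3: every data row must have as many cells as the header.
    for row_number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                (row_number, f"expected {len(header)} cells, found {len(row)}")
            )
    return problems


sample = "id,name,name\n1,Ada,Lovelace\n2,Grace\n"
for row_number, message in check_table(sample):
    print(f"row {row_number}: {message}")
```

Each problem is reported with a row number and a plain-language message, which is the kind of output a non-technical user would then need help interpreting.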
ODE is a unique tool in that it offers these capabilities to those non-technical data practitioners who typically analyse individual data files or data from public sources in a more ad hoc manner. Existing ‘data observability’ tools, such as Metaplane and Monte Carlo Data, are typically aimed at a technical audience to facilitate robust integration into large-scale data pipelines. Building a data preparation tool for non-technical users remains a major challenge and requires special attention to the interface, level of abstraction, and interactions. Writing the code is therefore only one side of the coin. A combination of soft and technical skills is needed to ensure that complex technical terms and feature implementations are understandable and transparent to those who are not necessarily exploring how something that looks simple, such as an AI button in the app, works.
The collaboration between the ODE team and myself, Madelon Hulsebos, as an AI Consultant, was prompted by a desire to explore how Artificial Intelligence (AI) features could be used to enhance the core functionality of ODE and help users.
As a first step, the team’s Product Owner, Romina Colman, and I met to discuss the status of the Open Data Editor and the shortcomings of related tools that the team had identified through a survey. We found that it can be difficult for users to understand how to use the tool and how to interpret its interactions (e.g. error messages). We also concluded that for non-technical users of ODE, it is important to provide transparency about what is happening, why and how, and to ensure the privacy of the user’s data.
The key question, therefore, is how AI can enhance the ODE’s ability to validate and improve data quality in a way that is transparent, privacy-preserving and trustworthy to a non-technical audience.
The Open Data Editor team had initially identified 3 ideas:
Based on this, it seemed that the first idea, semantic metadata refinement, such as “column descriptions” and “descriptive column names”, was at the core of ODE’s capabilities and that this feature would significantly help users to better understand their data.
I reviewed the ideas generated by the ODE team and enriched them with further suggestions that would improve the core functionality as well as some ideas for extending the functionality of the application. The ideas were described along the following list of dimensions:
I proposed three additional AI features that would assist users in using the ODE by 1) interpreting error messages, 2) answering questions about the ODE documentation, 3) summarising and contextualising the table metadata generated by ODE.
In addition to these features, I identified three other features that would extend ODE capabilities:
These features will proactively guide a non-technical user in validating and improving the quality of their data with ODE, extending its current capabilities.
The team met to reflect on the ideas from different perspectives: backend, frontend, and product. The aim of this meeting was to come up with a list of priorities and an action plan. The ODE team prioritised four ideas based on functionality, in order:
The team crystallised these feature ideas and I provided additional input based on open questions.
Based on insights from a few testing sessions with community members, the implementation effort versus the release timeline, and coordination with the Frictionless community, the team decided to start improving table metadata by suggesting improved column names and column content descriptions.
After identifying the AI feature that the team wanted to focus on first, they developed an initial implementation of the feature, taking into account the key values:
After providing instructions for the AI implementation, Romina tested it from the product side. Later, she had a meeting with me to ensure that the ODE team had not overlooked any relevant elements in the implementation. In fact, during this call, I noticed that the AI box, which asks the user to insert an OpenAI key, did not contain any references to explain how to obtain it. We added a link to the OpenAI documentation, and as OKFN has just launched a general course to help people work with open data, we asked the instructors to explain what a key is. We also added text and a link to allow users to check the terms and conditions of OpenAI. Finally, the link to this blog post will be added to the ODE’s user guide, so that people can also read more about the implementation process and decisions there.
The current pipeline for the AI feature is as follows:
[Note to readers: On 18 December, our team held the first group user test for the Open Data Editor stable release. One of the participants suggested changes to this message, such as including the name of the third party that ODE uses for AI integration (OpenAI), and some additional clarification regarding the steps that follow when the user clicks ‘Confirm’. We will be releasing a new version with these changes soon.]
Upon further review, we identified several revisions that were important for the first release of ODE with the integrated AI feature:
Integrating AI into the Open Data Editor can have significant value in providing a data quality validation and improvement tool that is accessible to non-technical users.
The key question was how AI can strengthen the Open Data Editor’s ability to validate and improve data quality in a transparent, privacy-preserving and trustworthy way.
In this blog post, we reflected on the process and outcome of the AI feature, and outlined a roadmap for future integrations of AI functionality in ODE. The team successfully integrated its first AI feature: using an LLM to generate enhanced column names along with column descriptions, which helps users understand their data and improve metadata. The implementation of the feature minimises the amount of data actually passed to the LLM: only the table column names are provided, ensuring privacy. The user is actively informed of what is being shared with the LLM, ensuring transparency. When sending the table metadata to the LLM, the prompt is preset in ODE, while the LLM call restricts the generated metadata to be formatted in a structured way, ensuring trustworthy output.
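The three safeguards described here (sending only column names, using a preset prompt, and accepting only structured output) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ODE's code: the prompt text, the function names `build_request` and `parse_response`, and the JSON shape are all hypothetical, and no LLM is actually called. The post says the LLM call itself restricts the output format; this sketch instead checks a reply after the fact.

```python
import json

# Assumed preset instruction; the real ODE prompt is not given in the post.
PRESET_PROMPT = (
    "Suggest a clearer name and a one-sentence description for each column. "
    "Reply with a JSON list of objects with keys: original, name, description."
)


def build_request(table):
    """Build the text sent to the LLM from column names only (no cell values)."""
    column_names = list(table[0])  # the first row is treated as the header
    return PRESET_PROMPT + "\nColumns: " + json.dumps(column_names)


def parse_response(raw, column_names):
    """Accept a reply only if it is well-formed and covers exactly our columns."""
    suggestions = json.loads(raw)  # raises ValueError on invalid JSON
    if not isinstance(suggestions, list):
        raise ValueError("expected a JSON list")
    required = {"original", "name", "description"}
    for item in suggestions:
        if not isinstance(item, dict) or set(item) != required:
            raise ValueError("unexpected suggestion shape")
    if [item["original"] for item in suggestions] != list(column_names):
        raise ValueError("suggestions do not match the table's columns")
    return suggestions


table = [["nm", "dob"], ["Ada Lovelace", "1815-12-10"]]
request = build_request(table)

# A stand-in for what a model might return; not real model output.
fake_reply = json.dumps([
    {"original": "nm", "name": "full_name", "description": "Full name."},
    {"original": "dob", "name": "date_of_birth", "description": "Date of birth."},
])
for suggestion in parse_response(fake_reply, table[0]):
    print(suggestion["original"], "->", suggestion["name"])
```

Because `build_request` reads only the header row, the cell values never leave the user's machine, and the same string can be shown to the user before anything is sent.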
Overall, the final AI feature strengthens the core of ODE by helping users better understand their data before anything is done with it, taking into account the key values of transparency, privacy and trustworthiness.
X11R1 on a Sun/1
Evolving the Vnode Interface: Fig. 11
This simple module can use any file system as a file-level cache for any other (read-only) file system. It has no knowledge of the file systems it is using; it sees them only via their opaque vnodes. Figure 11 shows it using a local writable ufs file system to cache a remote read-only NFS file system, thereby reducing the load on the server. Another possible configuration would be to use a local writable ufs file system to cache a CD-ROM, obscuring the speed penalty of CD.
Over the next quarter-century the idea of stacking vnodes and the related idea of "union mounts" from Rob Pike and Plan 9 churned around until, in October 2014, Linus Torvalds added overlayfs to the 3.18 kernel. I covered the details of this history in 2015's It takes longer than it takes. In it I quoted from Valerie Aurora's excellent series of articles about the architectural and implementation difficulties involved in adding union mounts to the Linux kernel. I concurred with her statement that:
The consensus at the 2009 Linux file systems workshop was that stackable file systems are conceptually elegant, but difficult or impossible to implement in a maintainable manner with the current VFS structure. My own experience writing a stacked file system (an in-kernel chunkfs prototype) leads me to agree with these criticisms.
I wrote:
Note that my original paper was only incidentally about union mounts, it was a critique of the then-current VFS structure, and a suggestion that stackable vnodes might be a better way to go. It was such a seductive suggestion that it took nearly two decades to refute it!
Nevertheless, the example I used in Evolving the Vnode Interface of a use for stacking vnodes was what persisted. It took a while for the fact that overlayfs was an official part of the Linux kernel to percolate through the ecosystem, but after six years I was able to write Blatant Self-Promotion about the transformation it wrought on Linux's packaging and software distribution, inspired by Liam Proven's NixOS and the changing face of Linux operating systems. He writes about less radical ideas than NixOS:
So, instead of re-architecting the way distros are built, vendors are reimplementing similar functionality using simpler tools inherited from the server world: containers, squashfs filesystems inside single files, and, for distros that have them, copy-on-write filesystems to provide rollback functionality.
The goal is to build operating systems as robust as mobile OSes: periodically, the vendor ships a thoroughly tested and integrated image which end users can't change and don't need to. In normal use, the root filesystem is mounted read-only, and there's no package manager.
Since then this model has become universal. Distros ship as a bootable ISO image, which uses overlayfs to mount a writable temporary file system on top. This is precisely how my 1989 prototype was intended to ship SunOS 4.1. The technology has spread to individual applications with systems such as Snaps and Flatpak.
SEGA's Virtua Fighter on NV1
by David. (noreply@blogger.com) at January 05, 2025 08:38 PM
Win free books from the January 2025 batch of Early Reviewer titles! We’ve got 161 books this month, and a grand total of 2,766 copies to give out. Which books are you hoping to snag this month? Come tell us on Talk.
If you haven’t already, sign up for Early Reviewers. If you’ve already signed up, please check your mailing/email address and make sure they’re correct.
The deadline to request a copy is Monday, January 27th at 6PM EST.
Eligibility: Publishers do things country-by-country. This month we have publishers who can send books to the US, the UK, Canada, Israel, Netherlands, Italy, Latvia, Lithuania, Luxembourg, Malta and more. Make sure to check the message on each book to see if it can be sent to your country.
Thanks to all the publishers participating this month!
Akashic Books | Artemesia Publishing | Autumn House Press |
Baker Books | Bellevue Literary Press | Bethany House |
Boss Fight Books | CarTech Books | Cinnabar Moth Publishing LLC |
City Owl Press | CMU Press | Crooked Lane Books |
Gefen Publishing House | HTF Publishing | Inlandia Institute / Inlandia Books |
Kinkajou Press | Legacy Books Press | NeoParadoxa |
New Door Books | Prolific Pulse Press LLC | PublishNation |
Restless Books | Revell | Riverdale Avenue Books |
Rootstock Publishing | Running Wild Press, LLC | Simon & Schuster |
Somewhat Grumpy Press | Tundra Books | Type Eighteen Books |
University of Nevada Press | Unsolicited Press | Vesuvian Books |
Vibrant Publishers | What on Earth! | Wise Media Group |
Yorkshire Publishing | Zibby Books |
One of the very first issues of Thursday Threads was on data centers (2011). That issue had articles on a major Amazon Web Services outage, remote data centers powered by renewable energy, and videos about Google's and Meta's data centers. Unfortunately, I've found that the videos are lost to time. It is interesting that the concerns about data centers live on. This post continues that thread with these topics:
Also recently on DLTJ:
Feel free to send this newsletter to others you think might be interested in the topics. If you are not already subscribed to DLTJ's Thursday Threads, visit the sign-up page. If you would like a more raw and immediate version of these types of stories, follow me on Mastodon where I post the bookmarks I save. Comments and tips, as always, are welcome.
There has been much written about how data center development is moving into areas where it can soak up cheap excess electricity. This is the first I've heard about how data center power draws can distort or even harm the grid for existing customers. Set aside how the article is framed as another way that the creation of AI-driven products is harmful; data centers are going to be built no matter what the purpose. As the nation's power grid is restructured to incorporate more renewable sources and power storage mechanisms, this is yet another factor that will make that transition more challenging.
It isn't just power; water use — primarily for cooling — is also a concern: "Artificial intelligence technology behind ChatGPT was built in Iowa — with a lot of water".
The article highlights the tension between data center development and community concerns, drawing parallels between the historical oil industry and the current rise of bitcoin mining in the region. Despite claims from the developer that their operations will stabilize the grid, critics argue that cryptocurrency mining is a drain on resources and exacerbates noise pollution. Residents want local governments to address the issues where noise has led to health issues and disrupted lives.
It isn't just rural areas, either; suburban Virginia and downtown Chicago are also affected.
Amazon Web Services has joined other tech giants like Google in investing in small modular nuclear power, reflecting a growing interest in nuclear energy among major companies. The interest in small modular reactors stems from increasing energy demands, particularly from data centers, and the challenges of relying solely on renewable energy sources. While renewables are cost-effective, their intermittent nature and grid connection issues limit their viability for continuous power needs. The lengthy and costly construction timelines of large reactors further complicate the situation, making small modular reactors a more appealing option despite their unproven status.
Don't count out Microsoft...it wants to restart and refurbish Three Mile Island reactors as part of its data center energy plans. Key permits are still needed before this is fully in place, though.
The rise of data centers in Africa will hopefully solve economic disparities and enhance digital sovereignty on the continent. For instance, improved proximity to data centers will lower transit costs for internet service providers, potentially boosting online economic activity. (Historically, African data has been stored internationally, leading to slower connections and complicating compliance with local privacy laws.) Connectivity — especially below the equator — remains an issue, though; "Google and Meta’s underwater cables up the stakes on internet control"
Although some African countries welcome the new data centers, communities elsewhere are concerned. Chile, for instance, has "multiple groups working to keep Amazon, Google, and Microsoft from doubling the number of centers in the country, fearing environmental devastation". And a Norwegian ammunition manufacturer blames ‘storage of cat videos’ for threatening its growth.
We have a temporary third cat, Pickle, in our home that is very food driven. It will bully the other cats away from their food. (Pickle is also known for stealing and eating whole chocolate chip muffins from the breakfast table, too.) So we added one of those microchip-enabled pet doors to this plastic tote so our first cat, Mittens, can eat in peace.
I entered 2024 on the brink of burnout. Maybe a bit past the brink. I didn’t feel secure in my job; I was still exhausted from a rushed move at the start of the year; I ignored the fact that I was already busy and tired and went for my Maine Master Gardener Volunteer certification anyway; and, you know, there’s that whole pandemic thing where I was (still am) living the 2020 lifestyle and feeling very alone in that.
I want to explain the job insecurity, but if you don’t care about that (and you shouldn’t, it’s a downer), please feel free to skip this paragraph! By the start of 2024, I knew my job really well, inasmuch as it was knowable (the boundaries of my role were pretty squishy). I’d done some worthwhile things, impressed some folks, built some strong relationships, and learned how to get information in an organization where news was rarely shared via official channels. I was doing good work that I was proud of and that my stakeholders were happy with. But my position felt precarious because, even though I was hired to work remotely and will probably always need to stay remote, 1) my manager had tried to pull me into in-person work and insisted I should apply for a formal accommodation with HR if I wanted to avoid on-campus work or to be kept safe with any mitigations beyond my own mask during the six campus visits she could require I make per year. I ended up with a written agreement that I was officially 100 percent remote (no required visits at all) for now, but aside from being allowed to mask, I would be unprotected if I ever visited campus (meaning that my manager could require long meetings in small, unventilated, overcrowded rooms with no breaks, and I was not allowed to ask others to mask); also, I had to reapply yearly, with a new doctor’s note each time. The process was dehumanizing and felt like it would inevitably fall through because 2) the associate director of HR who managed my case was visibly unenthusiastic about remote work/employees and also responded mockingly to all of my doctor’s requests for mitigations to improve safety. It’s worth saying: I had the full support of the library’s new director and every coworker I ever spoke to besides the HR rep and my manager (and after our director stated his support, my manager’s opinion also seemed to change). 
I genuinely believe my director would have done his best for me; but ultimately, I couldn’t know how much power he would have had to overrule HR.
Given my feelings of precarity and some other issues that arose while I was in that role, I feel incredibly fortunate to have, after a short search, found a new position that feels like a great fit! I was able to give 3+ weeks of notice, staying through the first week of fall semester at my old job, which left them in a good spot for the school year. I took three weeks for myself, and at the beginning of October, I started my position at Colorado State University Libraries as a Developer & Systems Administrator. (Which means I get to build my sysadmin skills! I’d happily accept advice on how to do that!)
I know it’s early to say this with any certainty, but so far, I feel like CSU Libraries have lived up to all of my hopes. Lots of people have hybrid schedules, and two others are fully remote, so meetings are generally online for maximal inclusion. My manager knows about and is cool with my health/disability situation; she also does a great job running a remote/hybrid technology team, including regular informal chats; the two other people with the same title I have(!!) are so smart and skillful but also extremely kind and patient; the whole team is fantastic and helpful and super sharp; the colleagues I’ve met from across the Libraries are delightful; our dean has a technology background (so she understands the value of our work) and is thoughtful about power structures; and there are not only multiple vocal advocates for accessibility, but the library’s leadership puts real time and resources into DEIA efforts. On the technology side, we run our own servers and a lot of infrastructure I haven’t been able to touch in previous positions, which is a cool new challenge. It’s great. So many things I’ve heard other libraries say “can’t be done” are being done, here, and I’m glad to be a part of it! Now I just want to get up to speed as quickly as I can so I’m a productive part of it.
As a bonus, my hours are generally 10:30am-7pm (my choice, they’d have worked with me if I wanted to be more on Eastern time), which fits so much better with my natural circadian rhythm than east coast work ever did. They also believe in flexibility for their workers, so I have room to shift my schedule as needed on individual days (say, to sleep through an afternoon migraine and make up the time in the evening), or potentially to negotiate different starting and ending times on different weekdays if needed. Of course there are scheduled meetings and goals/expectations and all that — it’s a job — but I genuinely haven’t caught any whiffs of the presenteeism you see in so many academic and business institutions.
So, I mean, yeah, I’m probably still burnt out. That doesn’t go away quickly. But I think I’m in a good, sustainable situation, job-wise, which is going to help.
It also doesn’t hurt that I’ve made it through my Maine Master Gardener Volunteer (MGV) traineeship. Or, well, mostly. I finished all ~40 hours of lessons (October 2023 – March 2024); I’m at 38.25 hours of the 40 I was meant to have volunteered in 2024; and my coordinator says they won’t throw me out for not hitting 40 on the dot. (Most MGVs are retired. Not a lot of us try to do it on top of full-time jobs. And I think I’m the only one who does it entirely virtually, so I’m not counting any travel time or anything.) Next year, and every year thereafter that I want to maintain my MGV certification, I only need 20 hours, which still sounds exhausting to me, right this second, but is certainly more achievable than 40 + lessons.
In all that time volunteering and changing jobs and everything, I completely neglected my own yard and garden. The work we paid for in 2023 ended up being pretty bad, alas, and we haven’t really had the heart (or the energy) to clean it all up, beyond filling in the hole after our pond liner floated out of it last winter. (Seriously, the company we hired? Do not recommend, at least not for ponds or lawns, or really trees. The whole thing was a mess, start to finish.) We’ve made a little progress, and I hope to make more—get those elevated beds up so I can grow things—but we’ll see.
In sad news, we said goodbye to our 17 year old chinchilla, Princess Eleanor Rubidium Chinchillington, III (a.k.a. Ella). She had a genetic condition where her teeth grew in both directions, not just up from her jaw, but down into and through it; it’s not something they can do surgery for, so it’s always eventually fatal. We kept her as comfortable as we could, with pain meds and squishy food twice a day, and in her last months of life she enjoyed total run of Dale’s room (since she had stopped chewing on things, she didn’t have to be supervised to be out) and several times broke containment and enjoyed the life of a small, fuzzy criminal, running rampant through the whole house. She passed quietly, while being cradled in a blanket by her favorite person in the world.
Our other pets (Phoebe, Pumpkin, Hermann, and Newton) are mostly doing well. Pumpkin was named Chubby Bird of the Day (Facebook link) back in April. Phoebe is a very old man who needs pain meds for his foot each day and sleeps more than he’s awake, but he toddles around to where he wants to go and makes happy beak noises every day. Pumpkin checks on him first thing every morning, which is adorable, and (sometimes) yells at us when he knows Phoebe needs us for something. (He also yells for no reason, so. Maybe we’re projecting intention, here.) And the budgies annoy them both whenever they’re all out.
Other notable things that happened this year: watching a rescued baby seal being released back into the ocean; traveling just a couple of hours away to see the eclipse in totality; speaking at Solstice School; experiencing the best aurora borealis of either of our lives, ever, including when we lived in Alaska (that’s the hero image for this post); and making an excellent crocheted “Medusa” hat with cartoony snakes for Halloween. By several measures, it was a good year.
I’ve signed up for online American Sign Language classes. (The organization is called “Queer ASL,” and the courses center 2SLGBTQIA+ people and experiences, but allies are welcome and invited to take the courses too!) Just as the herbal classes I took online in 2024 (the Zoom sessions of Wild Cherries Year 2: Racemes of Delight; they’re in the Pittsburgh area, so I couldn’t join anything in-person) didn’t add to my feelings of exhaustion, I’m finding that studying ASL is also energizing rather than draining. It uses a different part of my brain than work or chores, and (as I might talk about in another post coming up) it feels like one of the “little good things” I can do that might make the world a better and more inclusive place in the coming years.
I’m doing a self-paced training on herbs for chronic illness with an herbalist I respect, but because it’s entirely self-paced, I’m doing it painfully slowly. (There’s a pun there, probably.) I’d better finish that in 2025; I think I only have access for a year?
And in terms of work-related trainings, CSU paid for access to Practical Accessibility for each Dev/Sysadmin (and for our new User Experience Professional!), and I made good progress on it during my first three months on the job. I’ll finish that in 2024, and then I’ll start looking for Linux / Systems Administration / Cybersecurity training.
A thing I love about the Stonefruit and Wild Cherries community in Pittsburgh—and probably part of the reason those classes were more soothing than tiring—is that the people involved are so caring for and careful with each other. In addition to a number of other caring behaviors, they kept each other safe by holding their in-person classes outdoors, requiring a negative COVID-19 test before arrival, and everyone masking while indoors. I would love to find such a caring community up here in Maine, enough so that the temptation to start one keeps growing. (Hear me out: there’s the Feminist Bird Club, which has no chapters here, and also Birdability, which has no captains I’m aware of here; a hybrid of the two would be a great addition, right? Our local Audubon chapter already offers accessible birding events, so I bet they’d be supportive.)
If I don’t find (or create) any kind of formal community, I will at least spend more time hosting friends, I hope. We lit our fire pit on December 21, and a couple of friends stopped by. It was much too cold to be outside, but the company was great! We’ll hopefully do more of that, during more pleasant weather.
I also plan to be more intentional with my time, cutting down on social media and replacing the “ambient news” part of it with RSS feeds and the socialization part by scheduling time to talk with people, or emailing them, or writing letters. I found an RSS reader that will let me follow the Facebook pages my town uses for official news (I wish I were joking), so that’s one problem solved. The hope is that I’ll gain back some time for reading, gardening, and other things that benefit my mind and spirit more than scrolling — or if I don’t gain back time, because I’ve scheduled so much for socializing, then at least the connections I have with others will be more meaningful.
I won’t belabor this point, but I am expecting a lot of turbulence in the coming years. I’m proud of us for getting through 2024, and I hope we will each do our part to get as many of us through 2025 as possible. Keeping mind, body, and spirit together can be a lot, and when we manage it, it’s something to celebrate.
My motto going into 2025 is the same as it’s been since 2017 or so: brace for tough times, but don’t let go of hope. Make the world better in whatever small ways you can.
A monthly round-up of news, upcoming working group meetings and events, and CLIR program updates from the Digital Library Federation. See all past Digests here.
Happy New Year! We hope everyone is staying healthy throughout the winter.
— Team DLF
For the most up-to-date schedule of DLF group meetings and events (plus NDSA meetings, conferences, and more), bookmark the DLF Community Calendar. Meeting dates are subject to change. Can’t find meeting call-in information? Email us at info@diglib.org. Reminder: Team DLF working days are Monday through Thursday.
DLF groups are open to ALL, regardless of whether or not you’re affiliated with a DLF member organization. Learn more about our working groups on our website. Interested in scheduling an upcoming working group call or reviving a past group? Check out the DLF Organizer’s Toolkit. As always, feel free to get in touch at info@diglib.org.
Below are some ways to stay connected with us and the digital library community:
The post DLF Digest: January 2025 appeared first on DLF.
A short announcement attached to this year’s blog review post. Move off of WP.com to DreamHost I posted a while back about moving off of WordPress.com, and I’ve finally done the official switch over. It took a while to get the domain transferred, and then it was close enough to the end of the year … Continue reading "2024 Blog Year in Review + Hosting Provider Change"
I've had a couple of people ask for advice on LIS courses recently, so I thought it might be useful to write up some thoughts as a blog post in case anyone else has similar questions. To situate my perspective: I completed a Graduate Diploma in Information Management at RMIT University in 2003. I've worked in public libraries, for the library-owned cooperative CAVAL, and in an academic library. In that time I have supervised placements for both diploma and higher degree students. I currently work at a university but it's more than two decades since I received my librarianship qualification so I'm not across what is taught in the LIS curriculum now.
This post is primarily for people intending a career in libraries, but most of it applies if you're thinking of working as an archivist or records manager.
As soon as you start talking to people about a possible new career in libraries or archives you will hear misinformation. Many people – including those in the industry already – have strong assumptions about what librarians and archivists do, and what their career prospects are likely to be, without necessarily having subjected those opinions to any rigorous testing. If anyone tells you they know what the career prospects for librarians will be in five or ten years, don't believe them. Nobody knows – including the university recruiters who will tell you whatever they think you need to hear to enrol.
No particular "personality type" suits library work – it's a broad field and there's something for everyone. If you love talking to people and working in a big team you could have a great time in public libraries or in academic library liaison work. If you love quiet space and getting into the detail of things, metadata or systems work could be for you. Thrive on short deadlines and pride yourself on your thoroughness? Maybe a career in legal or medical libraries awaits. School libraries aren't for you if you don't like children or teaching, but a corporate library career could be just the thing. But these are broad strokes – the truth is that all these roles and libraries need people with many different skills and inclinations, and sometimes require people to work against their skill set. Technical specialists need to be able to communicate effectively with those working in client-facing roles. Public library staff often need to be able to pull off an entertaining storytime on the same day they help someone with a local history query. Staff answering reference queries won't get far if they don't understand how library metadata systems work.
Contrary to popular myth, you won't get paid to read books all day, and outside of some niche library and archive roles you won't even spend much time actively helping researchers to find relevant books and papers for their latest project. But you certainly might enjoy a rewarding career with interesting challenges. If you love sending and receiving email, LIS could also be a great choice.
You may not need an additional qualification to get started in LIS if you already have qualifications and experience in a related field – and you might be surprised by what is related. Don't assume you need to already have your library qualification before you can find a paid role in the industry. If you have qualifications and experience in education, computer science or information technology, publishing, law, health and medical sciences, or academic research, you may be able to get a start in a professional-level role straight away. Having said that, it greatly depends on both the role and the attitude of the employer - so you should always check before spending time applying. You also will have very limited opportunities without a formal LIS qualification.
A formal LIS qualification is helpful (and often required), but working out which qualification you need can be confusing. The four types of LIS qualification are:
These courses all include a compulsory work placement.
After a period of consolidation there are now only three universities offering degrees - Charles Sturt University, Curtin University, and the University of South Australia. Diplomas are available from various TAFEs in New South Wales, Victoria, South Australia, Western Australia, Queensland, and Fiji. Courses you may see at Monash and RMIT are being taught out and are not accepting new enrolments.
Generally speaking, to gain a Diploma level qualification you study at TAFE rather than university. This has two big advantages - greater access to courses, and lower fees. A Diploma of Library and Information Services takes one year and costs around $12,600 for a government-funded place. A Bachelor degree would take three years and cost around $16,000 per year. A Diploma will generally give you credit towards a later Bachelor degree, so it's a common first step for people who aren't sure they want to fully commit, or simply can't afford to spend three years studying full time.
However, there's a catch - a Diploma qualifies you to be a "Library Technician" rather than a librarian. This is what is sometimes referred to as a "para-professional" role. My personal view is that these roles perpetuate a class divide within the profession and are used to underpay highly skilled professionals. The important thing you need to know is that if you hold only a Diploma you will not be eligible to apply for most qualified librarian positions. When you start looking for library jobs you may notice some position descriptions require applicants to "hold a degree conferring eligibility for Associate Membership of ALIA" or similar wording. This means Bachelor and higher degrees - a Diploma only confers eligibility for "Library Technician" or "General" membership. The Bachelor of Information Studies is available through Charles Sturt University. If you do not already hold a university degree in any other discipline, and you're sure you want to become a fully qualified librarian or archivist, this could be a good option. Opting for a Diploma and a career as a Library Technician can also be a reasonable choice - just make sure you know what you're getting into.
If you hold a university degree in any discipline, you have more options. The most common path is the (one year full time, two years part-time) Graduate Diploma. Some people choose to complete a full Masters degree, which takes another semester, or two if you enrol in Curtin's "extended" degree. You can usually "upgrade" from a Grad Dip or "exit early" from a Masters course with a Graduate Diploma, but you should check this before enrolling.
The advantage of the Grad Dip is that you end up with a qualification as a librarian in half the time. Why then, would you complete a Masters? Firstly, a Masters allows you to explore more options - most of the Masters courses qualify graduates as a librarian, an archivist, or a records manager, depending on which electives and specialisations they have chosen. This is useful if you think you might be interested in either career path but need to explore a bit more to decide which one. The second reason is that the Masters degree is more widely recognised internationally. Specifically, in the United States generally only holders of a Masters degree are recognised as being fully qualified librarians. If you're considering working in the USA at some point, a Masters qualification will make things easier.
The hardest of the hardcore enrol in a Master of Education (Teacher Librarianship). Look forward to an exciting career keeping up to date with two different sets of professional knowledge simultaneously, whilst embodying the enduring mental picture of what a librarian is for generations of children. No pressure!
You may not be able to secure a Commonwealth-Supported place for a Graduate Diploma or Masters, so they can be fairly expensive. If you're an Australian citizen you are still eligible for a FEE-HELP (HECS) loan.
Students are seen as the future of the GLAM professions, so professional organisations offer generous membership rates and opportunities. Make sure you take advantage of these, because they will not last beyond your degree. Attending events, meeting people and reading reports and papers are all excellent ways to develop your knowledge and professional networks even if you are not yet working in the sector.
All GLAM professional organisations offer a heavy discount for student members:
Or to put it a different way, you could join all of the above organisations as a student member for $225 a year – less than the cost of joining one of ALIA, ASA, or RIMPA as a professional member.
Joining these organisations gives you cheap or free access to events like webinars, workshops and meet-ups, as well as some training, PD and professional literature access. Some conferences also offer free tickets to student volunteers, and heavily discounted tickets for students. Most importantly, you'll be able to meet people working in the industry and build connections to future colleagues and employers.
Every degree includes a compulsory work placement - usually two weeks, sometimes three. The quality of your placement experience may vary depending on how seriously the host institution takes placements, the time of year, your needs, and how well matched you are. The first thing you need to be aware of is that LIS placements are always unpaid, and usually your educational institution will not let you do your placement in your existing workplace if you already have a library job. You might think that if the point is to give you real-world experience and you already work in a library, you should be able to just get credit for that. You'd be right, but that's not how it works.
Whether you already have some kind of paid or volunteer library job or not, the work placement is a good opportunity to see what things look like in a real workplace that you're unfamiliar with. If you know all about storytime at the public library, consider getting a placement in a research archive just to see if you like it. If you're sure you want to be a metadata specialist, consider asking to be placed with an outreach and information literacy team. Lots of students surprise themselves partway through their degree and end up on a different career trajectory to the one they expected – placements are a great opportunity to try something new.
Some institutions expect the students to organise their own placements, or at least to identify somewhere they would like to be placed. Hosting student professional placements can be resource-intensive for the host institution. Whilst in theory students are contributing to real work, host institutions have to organise some onboarding, supervision, and usually some kind of special project that can be completed part time in two weeks. This is a lot of work for us! You can make things easier by:
As well as studying hard, joining and participating in professional organisations, and preparing well for your placement, there are some other things you can do to prepare for a LIS career. Try to keep a cool head – you're going to receive a lot of advice, both solicited and unsolicited, about the best way you should or should not build a professional profile. The most important thing is to choose something that works for you and feels reasonably natural. Some people embrace social media and LinkedIn. Others never post online but join committees and volunteer at conferences. Others create a profile by writing articles or blogs.
Given we're information professionals, I usually encourage LIS students and graduates to register a personal domain name and create at least a basic personal website where you can post a professional biography and links to publications, social media and so on. Prospective employers will often do a web search for you, and it's good to consider which site you want to appear at the top of the list.
The other thing you need to keep in mind when meeting people is that it is a very small industry. Chances are high that you will meet these people again - at a conference, in an office, or across an interview desk. Try to remember this before you launch into your newly-formed opinion about a library service or professional colleague. You might be right, but it might not be what you want to be remembered for. I'm telling you this as a highly-opinionated friend.
Lastly: be careful about volunteer work. There's nothing wrong with volunteering your time for public service, as long as you enjoy the work and/or spending time with the other volunteers. It is easy, however, for volunteering to tip into exploitation of the volunteers and be used as a pressure point against paid workers. If you have no experience in the industry and want to "get your foot in the door", it's hard to avoid doing unpaid work, but take a broad view and consider who is getting the most value out of your experience. Every "volunteer" position that requires the same skills and knowledge as a paid role is one fewer of the paid positions you are hoping to secure.
Updated with corrections 1 January 2025.
AI revolutionizes manufacturing with predictive maintenance, automation, and data-driven insights, enhancing efficiency and innovation.
The post How AI Is Revolutionizing The Manufacturing Industry appeared first on Lucidworks.
The depths of the Depression would seem an unpromising time to revive the 1929 song “Happy Days Are Here Again”. But after it was played at the 1932 Democratic convention, it caught on as a song of hope for better things to come. And after years of work and struggle, prosperity returned.
Tomorrow this song and the rest of our #PublicDomainDayCountdown works join the public domain. And while tough times may still threaten, with work, struggle, and hope we too may bring happy days here again.
Credit: XKCD
a rude word meaning to try to persuade someone or make them admire you by saying things that are not true
The essence of successful bullshit is that it should be both plausible and presented authoritatively. Bullshitters are always tempted to buttress the appearance of authority by including actual evidence rather than just their interpretation of the evidence, but this is often a fatal mistake. Below the fold I discuss a classic example from MAGA's campaign to demonize immigrants.
𝐓𝐡𝐢𝐬 𝐢𝐬 𝐭𝐡𝐞 𝐦𝐨𝐬𝐭 𝐢𝐦𝐩𝐨𝐫𝐭𝐚𝐧𝐭 𝐩𝐨𝐬𝐭 𝐲𝐨𝐮’𝐥𝐥 𝐞𝐯𝐞𝐫 𝐫𝐞𝐚𝐝 𝐨𝐧 𝐦𝐚𝐬𝐬 𝐢𝐦𝐦𝐢𝐠𝐫𝐚𝐭𝐢𝐨𝐧 𝐢𝐧𝐭𝐨 𝐭𝐡𝐞 𝐔𝐧𝐢𝐭𝐞𝐝 𝐒𝐭𝐚𝐭𝐞𝐬:
In 1924, President Calvin Coolidge signed the Johnson-Reed Immigration Act, which halted mass immigration into the United States.
The act completely stopped immigration from Asia, and set strict quotas on immigration from other places including Europe.
Wages began to dramatically increase.
By the 1950s, America reached the peak of its economic and industrial might. Economic inequality was low. Large businesses and workers alike were prospering.
Then, in 1965 the Hart-Cellar Immigration Act was signed by President LBJ, opening the floodgates and allowing for massive unchecked immigration, especially from the global south.
Immediately, income growth and wage growth for the bottom 90% of earners came to an abrupt halt.
Soon after, the incomes for the top 1% of earners skyrocketed, leading to the massive wealth inequality in America we see today.
The political left attributes this to simply a lack of proper taxation and regulation, but that’s mainly not the case.
Mass immigration has allowed for modern scab labor to become the norm, allowing large companies to pay lower wages for foreign migrants who demand less.
The loser? America’s middle class, which has been deteriorating for more than half a century now.
When you see wealthy technocrats argue in favor of mass “skilled” immigration, keep this in mind.
Who is GabeGuidarini?
Ohio Field Rep @TPAction_. Comms @OhioCRs. President @UDRepublicans. Fmr Acting President @uscollegegop.
He is a Republican operative. His chain of causation sounds convincing, doesn't it?
In 1924, President Calvin Coolidge signed the Johnson-Reed Immigration Act, which halted mass immigration into the United States.
The act completely stopped immigration from Asia, and set strict quotas on immigration from other places including Europe.
Wages began to dramatically increase.
Let's put 1924 on the graph.
By the 1950s, America reached the peak of its economic and industrial might.
If America's "economic and industrial might" peaked in the 50s, why did the incomes of the bottom 90% continue rising until 1973?
Then, in 1965 the Hart-Cellar Immigration Act was signed by President LBJ, opening the floodgates and allowing for massive unchecked immigration, especially from the global south.
Immediately, income growth and wage growth for the bottom 90% of earners came to an abrupt halt.
Let's put 1965 on the graph.
the incomes for the top 1% of earners skyrocketed, leading to the massive wealth inequality in America we see today.
Can you see when the "incomes for the top 1% of earners skyrocketed"? That's right, it was in 1985, so "soon" means 20 years! And can you guess who was President of the US, and from what party, then? Right, it was a certain Ronald Reagan, a Republican.
by David. (noreply@blogger.com) at December 31, 2024 04:00 PM
Inspired by Tom Whitwell's 52 things I learned in 2022, I started my own list of things I learned in 2023. Reaching the end of another year, it is time for Things I Learned In 2024:
Other lists:
The Disney studio had a productive year in 1929. Along with releasing 12 new Mickey Mouse cartoons, it began a series of one-shot musical cartoons with animation designed to fit the music, instead of the other way around. The “Silly Symphony” series began with “The Skeleton Dance”, a creepy graveyard cartoon set to music Carl Stalling wrote after Disney couldn’t get rights to Saint-Saëns’ Danse Macabre. Watchable online now, it rises to the public domain in 2 days. #PublicDomainDayCountdown
2024 marked a significant year for Open Knowledge Nepal (OKN) as we strengthened data-driven governance and fostered a culture of open data and collaboration across Nepal. As the year comes to a close, we reflect on our major initiatives and achievements that paved the way for a more transparent and innovative data ecosystem.
This year, OKN addressed the persistent challenges of data silos and fragmented datasets in Nepal’s local governments through the Integrated Data Management System (IDMS) for Local Government project. Supported by The Asia Foundation’s Data for Development (D4D) Programme, this initiative provided comprehensive technical and non-technical support to enhance data utilisation at the local level.
In its fourth phase, the project emphasised data-driven decision-making, empowering municipalities with localised data management systems and knowledge. It is currently implemented in five municipalities – Birgunj Metropolitan City, Tulsipur Sub-Metropolitan City, Janakpurdham Sub-Metropolitan City, Lekbeshi Municipality, and Suddhodhan Rural Municipality – laying a strong foundation for sustainable data practices and improved governance.
As part of the Women in Data Conference 2024, OKN and The Algorithm organised Data Hackdays in Tulsipur and Birgunj on August 25 and September 1. These events brought together local governments and community members to explore the potential of data through the lens of the IDMS. Participants developed data stories addressing critical themes like Women’s Statistics, Health, Environment, and Education.
These hackdays highlighted the importance of evidence-based decision-making while fostering collaboration between local governments and communities. They showcased the growing involvement of women in data, emphasising the need for sustained efforts to enhance data literacy and public engagement.
Since its launch in 2018, Open Data Nepal has worked to make Nepal’s data permanently accessible online. In 2024, OKN initiated a comprehensive revamp of the portal to address fragmented and non-machine-readable datasets. The updated version, set to launch in 2025, will feature improved accessibility, user-friendly design, and advanced tools for data visualisation and exploration, further empowering citizens, researchers, and developers.
In 2024, OKN leveraged the IDMS to develop a sustainable Local Government Data Profile (LG Profile). This initiative aims to streamline data processes, enhance decision-making, and set a benchmark for data-driven governance. Addressing challenges like capacity gaps and vendor-locked systems, the LG Profile emphasises scalability, interoperability, and greater data ownership by local governments.
Currently being implemented at Tulsipur Sub-Metropolitan City, the LG Profile showcases how municipalities can adopt innovative, sustainable data practices to strengthen governance and resource allocation.
2024 was a vibrant year for OKN, filled with impactful events and collaborations aimed at promoting open data, inclusivity, and capacity building.
In 2024, OKN collaborated with leading organisations, including the Open Knowledge Foundation, The Asia Foundation’s D4D Programme, the Women in Data Steering Committee, Accountability Lab Nepal, and local governments. These partnerships strengthened Nepal’s open data ecosystem, fostering innovation and transparency.
As we close 2024, we extend heartfelt gratitude to our partners, collaborators, and the community for their unwavering support.
2025 promises to be a year of growth and impact, with key events like the PublicBodies Datathon, Women in Data Conference, and Open Data Day(s) on the horizon. We will also continue advancing projects like IDMS, Open Data Nepal, and the LG Profile, driving innovation in Nepal’s data ecosystem.
Expanding our reach, we look forward to launching regional projects focusing on the Asia region, aligning with the Open Knowledge Foundation’s strategic focus on The Tech We Want.
Here’s to a collaborative, sustainable, and open 2025!
“I was so happy. I was so safe,” laments Lois Farquar to a suitor late in Elizabeth Bowen’s The Last September. But from the book’s start, as she and her fellow Anglo-Irish gentry enjoy parties and dances, their Irish neighbors are fighting for independence from Britain, while they entertain British soldiers sent to suppress the rebellion. Unable to commit politically or romantically, Lois and her family lose much. Bowen’s novel joins the US public domain in 3 days. #PublicDomainDayCountdown
Lynd Ward’s Gods’ Man is a novel without words (apart from chapter titles) about an artist who makes a Faustian bargain with a masked stranger for artistic success. Told in 139 woodcuts, it was the first of 6 wordless novels by Ward, and the first American novel of its kind. Selling well when published in 1929, it influenced artists like Art Spiegelman and Will Eisner, who made graphic novels a genre of widespread ongoing interest. The public domain claims it in 4 days. #PublicDomainDayCountdown
Elmer Rice’s boisterous Street Scene wasn’t easily staged. Though set in front of a single tenement, it required over 30 actors, prompting many producers to turn it down. Rice eventually had to direct the first production himself. But that had over 600 performances on Broadway, won the 1929 Pulitzer Prize, and was later adapted into a film and an opera. A 2013 production staged it in the open on an actual New York street. The play opens in the public domain in 5 days. #PublicDomainDayCountdown
In 2010 Israel Katz called Abraham Zevi Idelsohn “the undisputed pioneer-scholar of Jewish music”. Part of Idelsohn’s claim to fame is his comprehensive survey Jewish Music in its Historical Development, written while he was cataloging the Eduard Birnbaum Collection of Jewish Music at Hebrew Union College. Covering Jewish music and its various influences from Biblical times to the early 20th century, the book was published in 1929, and joins the public domain in 6 days. #PublicDomainDayCountdown
In the midst of a “cruel land, this South”, Christ makes an unexpected sort of appearance in Countee Cullen’s long poem “The Black Christ”. It’s one of a number of poems in The Black Christ and Other Poems dealing with faith, injustice, sin, racial violence, and African American experience, among other themes.
The University of Missouri libraries has an exhibit of pages from the book, illustrated by Charles Cullen. The complete book comes to the public domain in 7 days. #PublicDomainDayCountdown
Most Christmas songs Americans are used to hearing on the radio were published after 1929. But most Christmas songs they’re used to singing in church are older. Much of the traditional American repertoire is in George Rittenhouse’s World Famous Christmas Songs, with 74 carols and nativity songs “specially arranged for popular usage in community caroling, school, chorus, church and home”. First published in 1929, and reissued in 1957, it’s in the public domain in 8 days. #PublicDomainDayCountdown
The dataset was collected with Documenting the Now’s twarc, using version twarc2 via the Academic Access v2 Endpoint. It contains 22,919,247 Tweets with the search term 🫡 from October 28, 2022 through February 18, 2023 (missing October 31, November 15, November 18, December 31, 2022, and January 7, 2023).
I wrote this back in April 2023:
Why not grab as many 🫡 emoji tweets while the platform is on fire? Seems like a fitting final use of Twitter Academic Research Access. I should probably just make that a Wayback link preemptively, eh? Who knows how long that page will be there at this rate!
Anyway, I was curious if there would be a big spike in the emoji’s usage on a few days during the hardcore timeline. After pulling the data and plotting it, it doesn’t really look like that’s the case.
Here is the old Tweet Volume graph.
Here is a new one, which really exposes a misunderstanding I had in harvesting tweets with twarc2 via date ranges. This is also illustrated in the above chart if you look closely. 🤦
saluting-face-2022-10-28-2022-10-31.dat
saluting-face-2022-11-01-2022-11-15.dat
saluting-face-2022-11-16-2022-11-18.dat
saluting-face-2022-11-19-2022-12-31.dat
saluting-face-2023-01-01-2023-01-07.dat
saluting-face-2023-01-08-2023-02-19.dat
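One way to see where the gaps come from is to treat each file's date range as half-open, with the start day included and the end day excluded. That reading is an assumption about the misunderstanding mentioned above, not something the post states outright. A minimal Python sketch, using only the dates in the file names (the helper names `covered_days` and `missing_days` are made up for illustration):

```python
from datetime import date, timedelta

# Date ranges copied from the harvest file names listed above.
ranges = [
    (date(2022, 10, 28), date(2022, 10, 31)),
    (date(2022, 11, 1), date(2022, 11, 15)),
    (date(2022, 11, 16), date(2022, 11, 18)),
    (date(2022, 11, 19), date(2022, 12, 31)),
    (date(2023, 1, 1), date(2023, 1, 7)),
    (date(2023, 1, 8), date(2023, 2, 19)),
]


def covered_days(ranges):
    """Days covered if each range is half-open: start included, end excluded (assumption)."""
    days = set()
    for start, end in ranges:
        day = start
        while day < end:
            days.add(day)
            day += timedelta(days=1)
    return days


def missing_days(ranges):
    """Days between the first and last covered day that no range covers."""
    covered = covered_days(ranges)
    first = min(start for start, _ in ranges)
    last = max(covered)
    span = (first + timedelta(days=n) for n in range((last - first).days + 1))
    return [d for d in span if d not in covered]


gaps = missing_days(ranges)
print([d.isoformat() for d in gaps])
```

Under that assumption the uncovered days are exactly the end dates of the first five files, which lines up with the missing days listed in the dataset description. Starting each new range on the previous range's end date, rather than the day after, would close the gaps.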
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)
df.language
.groupBy("lang")
.count()
.orderBy(col("count").desc)
.show(10)
+----+-------+
|lang| count|
+----+-------+
| en|9244552|
| ja|5781358|
| und|2211360|
| es|1090225|
| pt| 973493|
| ar| 726801|
| ko| 473167|
| fr| 351251|
| in| 327702|
| th| 290914|
+----+-------+
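To put those counts in proportion, here is a small Python sketch that turns them into shares of the whole dataset. It assumes the language DataFrame has one row per tweet, so the 22,919,247 total from the dataset description is the right denominator; the `shares` helper is a hypothetical name used only for this illustration.

```python
# Counts copied from the top-10 table above; total taken from the dataset description.
TOTAL_TWEETS = 22_919_247
lang_counts = {
    "en": 9_244_552,
    "ja": 5_781_358,
    "und": 2_211_360,  # "und" is Twitter's code for an undetermined language
    "es": 1_090_225,
    "pt": 973_493,
    "ar": 726_801,
    "ko": 473_167,
    "fr": 351_251,
    "in": 327_702,
    "th": 290_914,
}


def shares(counts, total):
    """Return each language's percentage of the total tweet count."""
    return {lang: 100 * n / total for lang, n in counts.items()}


pct = shares(lang_counts, TOTAL_TWEETS)
for lang, p in pct.items():
    print(f"{lang:>3}  {p:5.1f}%")

top10_total = sum(lang_counts.values())
print(f"top 10 combined: {100 * top10_total / TOTAL_TWEETS:.1f}%")
```

Read this way, English and Japanese together account for roughly two thirds of the dataset.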
Using saluting_face-user-info.csv from df.userInfo.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("saluting_face-user-info"), and pandas:
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)
df.mostRetweeted.show(10, false)
+-------------------+-----------------+
|tweet_id |max_retweet_count|
+-------------------+-----------------+
|1603224385477054465|147274 |
|1560231432207106048|86855 |
|1604556216889327617|76775 |
|1553755497014013953|71583 |
|1600843786640203783|66141 |
|1568676995743248385|47865 |
|1536619482281820160|43970 |
|1583007717253591041|38834 |
|1534770573804707840|38258 |
|1602561118056378368|37725 |
+-------------------+-----------------+
From there, we can append the tweet ID to https://twitter.com/i/status/ to see the tweet. Here’s the top three:
147,274
[#방탄밤] 따뜻한 겨울 끝을 지나 우리 다시 만날 그날을 기다리며, 12월 13일 화요일 #방탄소년단 의 기록🎞️ 우리 진!! 잘 다녀와요!! 💜러뷰💜
— BTS_official (@bts_bighit) December 15, 2022
(https://t.co/kkrMEH3Xjo)#사랑한다우리진곧봅시다💕 #LoveUBro💞 #123다치지말자🫡 #WeLoveJin
86,855
リアクションがミニオンすぎる子供達をご覧下さい🫡(町探検に来てくれたのでオムライス食べて貰いました) pic.twitter.com/xWF5FXffQD
— オムライスのプロ🥚卵研究家 (@omuraisupuro) August 18, 2022
76,775
MESSI 🫡🐐👏🏾👏🏾
— LeBron James (@KingJames) December 18, 2022
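The URL construction described above is simple enough to sketch in a few lines of Python. The `tweet_url` helper name is hypothetical; the base URL and the three IDs come from the table above. The IDs are kept as strings so that no tool along the way can round them as floating-point numbers.

```python
STATUS_BASE = "https://twitter.com/i/status/"


def tweet_url(tweet_id):
    """Append a tweet ID to the status base URL."""
    return f"{STATUS_BASE}{tweet_id}"


# Top three tweet IDs from the mostRetweeted output.
top_three = [
    "1603224385477054465",
    "1560231432207106048",
    "1604556216889327617",
]

urls = [tweet_url(tweet_id) for tweet_id in top_three]
for url in urls:
    print(url)
```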
Using saluting_face-hashtags.csv from df.hashtags.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("saluting_face-hashtags"), and pandas:
방탄소년단 is BTS written in Hangul.
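As a rough idea of how a top-hashtags count could be produced from that CSV, here is a dependency-free sketch that uses Python's standard library in place of pandas. It is not the code behind the original chart. The column name `hashtags` is an assumption about the exported file, and the sample rows are invented for the example.

```python
import csv
import io
from collections import Counter


def top_hashtags(csv_file, column="hashtags", n=10):
    """Return the n most common values in one CSV column."""
    reader = csv.DictReader(csv_file)
    # Assumption: one hashtag per row in the given column; empty cells are skipped.
    counts = Counter(row[column] for row in reader if row.get(column))
    return counts.most_common(n)


# Hypothetical sample standing in for saluting_face-hashtags.csv.
sample = io.StringIO(
    "hashtags\n"
    "방탄소년단\n"
    "LoveUBro\n"
    "방탄소년단\n"
)

hashtag_counts = top_hashtags(sample)
print(hashtag_counts)
```

With pandas, the equivalent would most likely be a `value_counts()` call on the same column.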
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)
df.urls
  .groupBy("url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+----------------------------------------------------------------------+------+
|url |count |
+----------------------------------------------------------------------+------+
|https://youtu.be/L-orDkbsuHk |129146|
|https://t.co/kkrMEH3Xjo |128719|
|https://twitter.com/ENHYPEN_members/status/1600843786640203783/photo/1|52397 |
|https://t.co/rJ8a1ygDru |52396 |
|https://t.co/F0tQ8nIOOZ |33000 |
|https://twitter.com/WayV_official/status/1613778127846801408/photo/1 |33000 |
|https://twitter.com/btsinthemoment/status/1602561118056378368/video/1 |31838 |
|https://t.co/Nr45zhQBNo |31248 |
|https://twitter.com/claricetudor_/status/1613609828340858919/photo/1 |30944 |
|https://t.co/TwIkLBAYWA |30944 |
+----------------------------------------------------------------------+------+
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "saluting_face.jsonl"
val df = spark.read.json(tweets)
df.mediaUrls
  .filter(col("media_url").isNotNull)
  .groupBy("media_url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+-----------------------------------------------+-----+
|media_url |count|
+-----------------------------------------------+-----+
|https://pbs.twimg.com/media/FjdW7uqUoAED64t.jpg|52397|
|https://pbs.twimg.com/media/Fl-EBDrWYAADM8o.jpg|25855|
|https://pbs.twimg.com/media/FhFGxc_akAAz5p0.png|15859|
|https://pbs.twimg.com/media/FkTWga5WAAAOEqf.jpg|15529|
|https://pbs.twimg.com/media/FmSxlMFWAA095lk.jpg|15473|
|https://pbs.twimg.com/media/FmSxlMPWAAkVM_H.jpg|15473|
|https://pbs.twimg.com/media/FhAKydlakAAnYp2.jpg|14314|
|https://pbs.twimg.com/media/FjI8sOPacAAilEE.jpg|13295|
|https://pbs.twimg.com/media/FktSKyxXkAEvFyV.jpg|11956|
|https://pbs.twimg.com/media/FmVKpX9aMAENckX.jpg|11000|
+-----------------------------------------------+-----+
Using saluting_face-text.csv from df.text.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("saluting_face-text"), Polars, and j-hartmann/emotion-english-distilroberta-base:
With any luck, this post will show up sometime on 12/23 and I’ll have successfully figured out how to schedule posts :).
First, I hope that everyone who celebrates is having the happiest of holidays. If weather, travel, etc. go well, we’ll be celebrating with family and enjoying a little bit of downtime with people that we love and care about. I hope that however you choose to spend the time, it’s merry.
On to MarcEdit. A couple of update notes. First, related to the Mac version of the application. After updating MarcEdit 7.7 to the new .NET core framework (.NET Core 8.0 LTS), I switched to my Mac to make changes and recompile the code, targeting the new framework…and I was in for a surprise. I had not been paying attention, but Microsoft has essentially ended development using Visual Studio on Mac systems, and has transitioned to using Visual Studio Code. That isn’t so much a big deal: Visual Studio Code has a great extension and can easily be used to write C# applications. The bigger issue was that the previous UI methodology I was using wasn’t being ported forward to support .NET Core 8. Since all of the current application now uses that framework, that is a problem. That leaves me with a couple of options:
I have decided to look for another option, because while there aren’t a lot of Mac downloads, there are enough that I’d like to keep the option open. So, I did a little research and testing, and for now, what I’ve done is create a process where you can run MarcEdit 7.7 via Wine on a Mac. Essentially, to make this work, I’ve done the following:
The results can be seen here:
https://youtu.be/jgkYJEfcZY0?si=zil7j5fZhoeJXyUq
I’ve written up some instructions; you will find them on the download page with the Mac option. I will provide a link to both the previous version and the new version, but at this point, all new development will happen on the version being wrapped for use with Wine. Long-term, I likely will be recoding the application to utilize a framework like MAUI or UNO so I can write one interface that will compile natively for multiple systems, but that is likely an 8-12 month project and right now, only the UNO framework is complete enough to make the transition work. So, while I evaluate frameworks, I’ll be moving all business code into a new code library, MarcEdit.Essentials, which should simplify the move at a later date.
Changes
Updates to this version of the application are as follows:
That’s pretty much it. Downloads have been posted, updates should prompt, the org version has been refreshed.
Again, have a happy holiday!
–tr
Duke’s Public Domain Day 2025 post discusses René Magritte’s pipe painting, “La Trahison des Images”, and the difficulty of determining whether US copyright law considers it “published” in 1929, and therefore public domain in 9 days.
A related work we know meets that criterion is his illustrated essay “Les Mots et Les Images”, showing distinctions between words, images, and objects, which his painting also expresses. It’s in the last issue of La Révolution Surréaliste. #PublicDomainDayCountdown
How often do you interact with data in your daily tasks? For many of us, working with data has become a regular part of our professional and personal lives. But, do you have the coding skills needed to clean and prepare data for deeper analysis?
When asked these questions during our session at the Global Voices Summit 2024 – a global gathering of digital media, knowledge, and activism leaders – the responses were revealing. While half the participants acknowledged working with data daily, only a small fraction raised their hands when asked about coding skills. This reflects a widespread reality: Many people work with data, but few have the tools or expertise to handle it efficiently.
Recognising these challenges, we introduced the Open Data Editor (ODE) during the Summit’s demo session. Designed as a practical tool for non-technical users, ODE is a desktop application that simplifies error detection, editing, and publishing for tabular data. In this demo session, we explored best practices for working with data and demonstrated how ODE can help users identify and correct common dataset errors, making collaboration and sharing easier. Whether you’re a journalist, public official, or researcher, ODE enables you to produce high-quality, reusable datasets without the need for coding expertise.
Because non-technical users often encounter significant difficulties when handling large datasets:
The ODE, powered by the Frictionless toolkit, addresses common challenges in data handling with a suite of powerful features. For instance, ODE’s simplified error detection automatically identifies issues such as missing headers, duplicate columns, and incorrect data types, ensuring that errors are promptly flagged. Its visual editing tools further enhance usability, allowing users to explore highlighted errors and correct them directly within the application, making the process both intuitive and efficient. Additionally, ODE excels in metadata management, enabling users to edit and maintain structured metadata, ensuring datasets remain clean and well-documented for future use.
ODE empowers individuals and organisations to unlock the full potential of their data. By simplifying complex workflows, it allows users to streamline data handling, enabling them to focus on deriving meaningful insights rather than being hindered by technical challenges. Moreover, ODE significantly enhances collaboration by facilitating the creation of clean, shareable datasets. This ensures teams can work together more effectively, aligning efforts and maintaining consistency. Its robust error detection and correction capabilities also increase accuracy, reduce mistakes, and improve the reliability of data-driven decisions.
In essence, the Open Data Editor enables users to work smarter, not harder, revolutionising how they manage and leverage their data.
Participants shared insightful feedback at the Global Voices Summit 2024 to further enhance ODE. One key suggestion was to introduce a more robust error-reporting mechanism, allowing users to flag and document errors more effectively. This improvement would enhance transparency and significantly improve the user experience. Additionally, participants emphasised the integration of AI capabilities as a transformative feature, given the growing reliance on AI. Such capabilities could enable smarter recommendations for error correction and further streamline workflows.
Another area of interest was the availability of a web-based version of ODE. A cloud-based solution would provide greater accessibility and ease of use across devices. For instance, some participants highlighted that their organisations do not permit the installation of new applications on company computers and devices, making a web-based option essential.
Lastly, participants stressed the need for a simpler pitch to communicate ODE’s value more clearly and effectively. A concise and compelling message would help ensure that its benefits resonate with a broader audience. These suggestions underscore the ongoing commitment to making ODE an even more powerful, accessible, and user-friendly tool for data practitioners.
Looking to make data management easier and more accessible? The Open Data Editor is here to help. Download the latest version today and experience its transformative features firsthand!
As the year ends, we always like to take a moment to reflect on the past 12 months and what we’ve learned. A favourite part of this is sharing the most popular newsletter links, which inspire our team and, hopefully, our readers too. And while admittedly our own blog was a little neglected this year, [...]
Hyperlinks are the essence of the web. They enable content discovery, allowing users to navigate between diverse sources of information with different interfaces, graphics, and technologies. Using links is straightforward - you just need to click or tap on them. It's also easy to create new links on the web: you just need to follow some basic rules and conventions.
Lately, there has been a renaissance of linkblogs, blogs focused on sharing curated links. Some notable examples of linkblogs I follow: Simon Willison's Links, Nelson's Linkblog, Kellan's Linkblog. Things Magazine is also a linkblog, as is the wonderful Italian newsletter Link Molto Belli == Very Beautiful Links. My RSS reader also follows many accounts from Pinboard; they are technically personal bookmarks, but I consider them equivalent to linkblogs too.
I've been thinking about creating my own linkblog for a while now. I browse the web and save many bookmarks, some are private, but most can be public. They reflect a curated filter of web content about topics related to my interests (digital libraries and archives, web archiving, books, mountains, and obscure music).
I discarded the idea of using any blogging service; I prefer to self-host my content (like this blog). I could have used a static site generator - there are dozens of them - but I always struggle to find one as simple as I want. Furthermore, since a linkblog is related to content I browse, I want something with less friction than creating a markdown file, pushing to a repo, and waiting for the build. I want an admin interface where I can post quickly, without leaving the browser, maybe with the help of a bookmarklet, a browser extension, or a Tampermonkey script.
I have also evaluated Pocketbase, which is a very nice application platform. You can easily create a data model (with migrations), the UI is minimalistic and beautiful, and you can easily plug in code (this was my first attempt that just publishes an RSS feed, linkbase). It's very easy and powerful, but some things are missing: an HTML interface (which could be quickly done with templ), but more importantly, Fediverse integration. Because yes, for a linkblog an RSS/Atom feed is mandatory, but these days ActivityPub is also a good way to publish content and reach readers.
A full Mastodon instance is overkill, considering the resources and maintenance required. I want something simpler. Here comes Snac, a simple, minimalistic ActivityPub instance written in portable C. A database is not needed, the data is stored in JSON files in the filesystem, dependencies are minimal, and there is no JavaScript.
I first heard of Snac from Stefano Marinelli, who is a lovely source of news from the BSD world, selfhosting, networking and everything related to Unix philosophy. Then from Giacomo Tesio and this good post How to run your own social network (with Snac).
So, this is my linkblog on fediverse, made with Snac: https://href.literarymachin.es/raffaele.
This is how I have installed it. I prefer a containerized deploy, but a static binary build and a systemd service are enough and maybe even simpler to deploy.
I build the image on my laptop:
git clone https://codeberg.org/grunfink/snac2.git
cd snac2
docker build -t snac .
Then I transfer the image to the remote server (it's a 12MB image, I don't need a registry!):
docker image save snac | ssh {REMOTE_SERVER} docker load
I run it with this Docker compose:
services:
  href:
    image: snac
    restart: always
    security_opt:
      - no-new-privileges:true
    volumes:
      - ./data:/data
    ports:
      - "8001:8001"
mkdir data
docker compose up -d
The basic configuration needed is changing the hostname:
cat data/data/server.json | jq .host
"href.literarymachin.es"
Then I create my user:
docker compose exec href snac adduser /data/data raffaele
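The hostname change mentioned above is only shown being read back with jq. As a hedged sketch, the edit itself could be scripted along these lines. The only thing taken from the post is that server.json is a JSON file with a `host` key; the `set_host` helper and the demo file contents are hypothetical, and whether a running server picks up the change without a restart is not verified here.

```python
import json
import tempfile
from pathlib import Path


def set_host(config_path, hostname):
    """Rewrite the `host` key of a JSON config file and return the new value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["host"] = hostname
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config["host"]


# Demo against a throwaway file rather than the real data/data/server.json.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "server.json"
    # Hypothetical minimal contents; a real server.json has more keys.
    demo.write_text('{"host": "localhost"}', encoding="utf-8")
    new_host = set_host(demo, "href.literarymachin.es")
    stored_host = json.loads(demo.read_text(encoding="utf-8"))["host"]

print(stored_host)
```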
And finally, I have configured an nginx proxy like this example.
Follow my linkblog, and suggest more linkblogs to follow!
The first musical Cole Porter and Herbert Fields wrote together was 1929’s Fifty Million Frenchmen, whose title alludes to a 1927 song not actually used in the show. The most memorable song it does use is Porter’s “You Do Something to Me”, in which the two main characters confess their mutual beguilement. Among the song’s many covers, I’m personally fond of Sinéad O’Connor’s, which my spouse and I danced to on our wedding day. Song and show go public domain in 10 days. #PublicDomainDayCountdown
This is a dataset that I hydrated in April of 2023; it was created by Zachary Maiorana, Pablo Morales Henry, and Jennifer Weintraub. It is the “#metoo Digital Media Collection - Hashtag: timesup”, which is part of the Schlesinger Library #timesup Digital Media Collection. The original dataset contains 3,720,729 Tweet IDs, and I was able to hydrate 2,518,092 tweets, giving me a hydration rate of 67.68%. The hydrated dataset covers October 15, 2017 through June 1, 2020.
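For anyone checking the arithmetic, the hydration rate is simply the number of hydrated tweets divided by the number of tweet IDs in the original dataset. A minimal sketch using the two figures above (the variable names are just for illustration):

```python
tweet_ids_in_dataset = 3_720_729  # Tweet IDs in the original dataset
tweets_hydrated = 2_518_092       # tweets that could still be hydrated

# Share of the original IDs that still resolved to a tweet, as a percentage.
hydration_rate = tweets_hydrated / tweet_ids_in_dataset * 100
print(f"{hydration_rate:.2f}%")  # prints 67.68%
```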
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)
df.language
  .groupBy("lang")
  .count()
  .orderBy(col("count").desc)
  .show(10)
+----+-------+
|lang|  count|
+----+-------+
|  en|2209208|
| qme| 112478|
|  es|  51085|
| und|  23017|
|  ja|  22439|
| qht|  15806|
|  pt|  15539|
|  fr|  14892|
|  ko|   8150|
|  th|   7134|
+----+-------+
Using timesup-user-info.csv from df.userInfo.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("timesup-user-info"), and pandas:
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)
df.mostRetweeted.show(10, false)
+-------------------+-----------------+
|tweet_id |max_retweet_count|
+-------------------+-----------------+
|950562797183799296 |26883 |
|971801461024686080 |21628 |
|950137819808305152 |19291 |
|950215871200456705 |17209 |
|956944474596196352 |12927 |
|958067014597300224 |12648 |
|950211856630628353 |11741 |
|950140126591504384 |11672 |
|1034341408361078784|10176 |
|950564161913868293 |10161 |
+-------------------+-----------------+
From there, we can append the tweet ID to https://twitter.com/i/status/ to see the tweet. Here’s the top three:
26,883
Spoiler Alert: it was about your father. https://t.co/AyUNRZICjI
— Keith Olbermann (@KeithOlbermann) January 9, 2018
21,628
It’s #InternationalWomensDay. There was a time not long ago that women couldn’t vote, or open credit cards without their husband’s signature, or compete in the Olympics, or do their jobs without being harassed. That time is up. #TimesUp
— The Ellen Show (@EllenDeGeneres) March 8, 2018
19,291
you’re literally in the new woody allen movie https://t.co/5jAElrqeFz
— laura j. brown (@laurjbrown) January 7, 2018
Using timesup-hashtags.csv from df.hashtags.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("timesup-hashtags"), and pandas:
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)
df.urls
  .groupBy("url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+----------------------------------------------------------+-----+
|url |count|
+----------------------------------------------------------+-----+
|https://twitter.com/ivankatrump/status/950561402053447685 |46826|
|https://t.co/AyUNRZICjI |20294|
|https://twitter.com/jtimberlake/status/950136611391471616 |13988|
|https://t.co/5jAElrqeFz |12572|
|https://twitter.com/IvankaTrump/status/950561402053447685 |8287 |
|https://t.co/si9E5MPw8I |7801 |
|http://www.karlyletomms.com/single-post/2018/01/07/MeToo |5459 |
|https://t.co/OmJCYUkD6W |5298 |
|https://twitter.com/GeeksOfColor/status/950165381607391232|4534 |
|https://t.co/3TMsQFmboS |4455 |
+----------------------------------------------------------+-----+
Using the full line-oriented JSON dataset and twut:
import io.archivesunleashed._
val tweets = "timesup.jsonl"
val df = spark.read.json(tweets)
df.mediaUrls
  .filter(col("media_url").isNotNull)
  .groupBy("media_url")
  .count()
  .orderBy(col("count").desc)
  .show(10, false)
+------------------------------------------------------------------------------------------+-----+
|media_url |count|
+------------------------------------------------------------------------------------------+-----+
|https://pbs.twimg.com/media/DS-RFRlVwAAvCJr.jpg |8463 |
|https://pbs.twimg.com/media/DTDwIDtVQAEPq0t.jpg |6749 |
|https://pbs.twimg.com/media/DTDgvL6VMAEJz-Y.jpg |3793 |
|https://video.twimg.com/amplify_video/950227596121288705/vid/640x360/60Eys_8CYjy8RiIh.mp4 |3327 |
|https://video.twimg.com/amplify_video/950227596121288705/pl/mTRKmGFdrLZ1g2l4.m3u8 |3327 |
|https://video.twimg.com/amplify_video/950227596121288705/vid/1280x720/OK_RCHIfzkFpFwHj.mp4|3327 |
|https://video.twimg.com/amplify_video/950227596121288705/vid/320x180/uVgPuVP2vwHFRYkd.mp4 |3327 |
|https://pbs.twimg.com/media/DS-QNbPU0AAS5OT.jpg |3310 |
|https://pbs.twimg.com/media/DS-JKK3UQAEOY0X.jpg |2700 |
|https://pbs.twimg.com/media/DS5Lz7tVoAEjmAX.jpg |2322 |
+------------------------------------------------------------------------------------------+-----+
A couple years ago I created a juxta (collage) of the images from this dataset. It features 298,158 images, and you can check it out here.
Using timesup-text.csv from df.text.coalesce(1).write.format("csv").option("header", "true").option("escape", "\"").option("encoding", "utf-8").save("timesup-text"), Polars, and j-hartmann/emotion-english-distilroberta-base: