Planet Code4Lib

1982 / CrossRef

I've been trying to capture what I remember about the early days of library automation. Mostly my memory is about fun discoveries in my particular area (processing MARC records into the online catalog). I did run into an offprint of some articles in ITAL from 1982 (*) which provide very specific information about the technical environment, and I thought some folks might find that interesting. This refers to the University of California MELVYL union catalog, which at the time had about 800,000 records.

Operating system: IBM 360/370
Programming language: PL/I
CPU: 24 megabytes of memory
Storage: 22 disk drives, ~ 10 gigabytes

The disk drives were each about the size of an industrial washing machine. In fact, we referred to the room that held them as "the laundromat."

Telecommunications was a big deal because there was no telecommunications network linking the libraries of the University of California. There wasn't even one connecting the campuses at all. The article talks about the various possibilities, from an X.25 network to the new TCP/IP protocol that allows "internetwork communication." The first network was a set of dedicated lines leased from the phone company that could transmit 120 characters per second (character = byte) to about 8 ASCII terminals at each campus over a 9600 baud line. There was a hope to be able to double the number of terminals.
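The 120 characters-per-second figure is consistent with simple line arithmetic, if one reads it as the share each terminal received. The sketch below is only a back-of-the-envelope check: the 10-bits-per-character framing (one start bit, eight data bits, one stop bit) is an assumption not stated in the article, as is the even split of the line across terminals.

```python
# Back-of-the-envelope check of the line capacity described above.
# Assumptions (not stated in the article): 9600 baud is treated as 9600 bits/s,
# each character costs 10 bits on the wire (1 start + 8 data + 1 stop),
# and the line is shared evenly by all terminals.
LINE_BITS_PER_SECOND = 9600
BITS_PER_CHARACTER = 10
TERMINALS_PER_CAMPUS = 8


def chars_per_second_per_terminal(line_bps, bits_per_char, terminals):
    """Return the characters per second each terminal gets on a shared line."""
    line_chars_per_second = line_bps / bits_per_char
    return line_chars_per_second / terminals


per_terminal = chars_per_second_per_terminal(
    LINE_BITS_PER_SECOND, BITS_PER_CHARACTER, TERMINALS_PER_CAMPUS
)
print(per_terminal)  # 120.0

# Doubling the number of terminals on the same line would halve each share.
doubled = chars_per_second_per_terminal(
    LINE_BITS_PER_SECOND, BITS_PER_CHARACTER, 2 * TERMINALS_PER_CAMPUS
)
print(doubled)  # 60.0
```

Under these assumptions, the hoped-for doubling of terminals would have needed either a faster line or acceptance of about 60 characters per second per terminal.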

In the speculation about the future, there was doubt that it would be possible to open up the library system to folks outside of the UC campuses, much less internationally. (MELVYL was one of the early libraries to be open access worldwide over the Internet, just a few years later.) It was also thought that libraries would charge other libraries to view their catalogs, kind of like an inter-library loan.

And for anyone who has an interest in Z39.50, one section of the article by David Shaughnessy and Clifford Lynch on telecommunications outlines a need for catalog-to-catalog communication which sounds very much like the first glimmer of that protocol.


(*) Various authors in a special edition: (1982). In-Depth: University of California MELVYL. Information Technology and Libraries, 1(4)

I wish I could give a better citation but my offprint does not have page numbers and I can't find this indexed anywhere. (Cue here the usual irony that libraries are terrible at preserving their own story.)

Open mapping minibus routes in South Africa: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report from the University of Pretoria in South Africa who received funding from Mapbox to develop a complete map of minibus taxi routes in Mamelodi East with the local knowledge of school learners.

Our Open Data Day event took place on 7th March 2020 at the Department of Geography, Geoinformatics and Meteorology at the University of Pretoria, Hatfield campus.

We celebrated the day by hosting 61 Grade 12 geography learners from Dr WF Nkomo High School in Atteridgeville.

Open Data Day is an annual international event that promotes the awareness and use of open data, and our event focused on open mapping. The aim of our event was to map minibus taxi routes in and around Atteridgeville. This was done in two phases, first using an approach called participatory GIS and then using QGIS. Participatory GIS focuses on using local knowledge to collect geospatial data in a community. In small groups, the learners were guided by a student to map the routes and stops using markers and stickers on an A3 aerial image with main routes and points of interest indicated. The learners presented their maps, and one of the main observations they made was the lack of information available in their area. The students then explained to them how OpenStreetMap can be used to map their community and why this is important.

During the second part of the day, the learners were introduced to QGIS (an open source Desktop GIS application). They used QGIS to digitise the routes and stops indicated on their paper maps, to create a digital map of the minibus taxi network in Atteridgeville.

Thank you to all the lecturers, students and learners that made this day a huge success. Most importantly, thank you to our sponsors, the NRF Community Engagement Grant, the Open Knowledge Foundation and Mapbox.

We are back on Twitter tomorrow for #LITAchat / LITA

Are you ready for the next Twitter #LITAchat? Join the discussion on Friday, May 22, from 12-1pm Central Time. We will be asking you to tell us about challenges with working from home. Are there things you can’t do and wish you could? Are there issues with your home setup in general? Anne Pepitone will lead the discussion.

We invite you to join us tomorrow to share your experiences and chat with your colleagues.

Follow LITA on Twitter

Catch up on the last #LITAchat

We’re looking forward to hearing from you!

-The LITA Membership Development Committee

Raising awareness of budget information in Uganda: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Robert Kibaya from the Kikandwa Rural Communities Development Organisation in Uganda who received funding from Datopian to showcase the Uganda Budget Information website and how to use it to report, track and monitor public funds.

The Uganda Budget Information website promotes transparency and accountability in the use of public funds by allowing any person to access and give feedback on national and local government budgets and performance.

This tool enables any person to access budget information of how resources are allocated and utilised down to the parish level. The tool details information on plans and performances as well as financial details. The tool allows users to provide feedback on service delivery in their local area. 

Unfortunately the public is not generally aware of the existence of this tool and our workshop will sensitise people on how to use the tool.

So the main goal of our Open Data Day event was to sensitise the participants about the existence of the Uganda Budget Information website and how to use it to report, track and monitor public funds.

Our workshop took place on 11th March 2020 at Jobiah Hotel in Mukono Municipality and it was attended by 22 selected participants from Civil Society Organizations (CSOs) operating in Mukono District.

Before introducing the participants to the Budget Monitoring Tool, I led a session on the basics of the budget process, highlighting the three most significant stages:

  1. The Budget Framework, 
  2. The Draft Budget and 
  3. The Approved Budget.

The participants were introduced to the Uganda Budget Information website and instructed to select their respective Local Governments and view all their publications to examine available budget and expenditure data.

After receiving the training, participants pledged to sensitise others and provide basic training on how to access information from the Budget Information website.

LITA Job Board Analysis Report – Laura Costello (Chair, Assessment & Research) LITA Assessment & Research and Diversity & Inclusion Committees / LITA

Background & Data

This report comes from a joint analysis conducted by LITA’s Assessment & Research and Diversity & Inclusion committees in Fall 2019. The analysis focused on the new and emerging trends in skills in library technology jobs and the types of positions that are currently in demand. It also touches on trends in diversity and inclusion in job postings and best practices for writing job ads that attract a diverse and talented candidate pool. 

The committees were provided with a list of 678 job postings from the LITA job board between 2015 and 2019. Data included the employer information, the position title, the location (city/state), and the posting date. Some postings also included a short description. The Assessment & Research Committee augmented the dataset with job description, responsibilities, qualifications, and salary information for a 25% sample of the postings from each year using archival job posting information. Committee members also assigned metadata for the type of position and indicated the presence or absence of salary information in the posting.
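Drawing a fixed fraction of postings from each year is a form of stratified sampling. The report does not describe how the committee actually drew its sample, so the following is only a minimal sketch of one way to do it; the function name `sample_by_year`, the `year` and `title` fields, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def sample_by_year(postings, fraction=0.25, seed=0):
    """Draw roughly `fraction` of the postings from each year (stratified sample).

    `postings` is a list of dicts with a "year" key (a hypothetical layout).
    At least one posting is kept for every year that has any postings.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    by_year = defaultdict(list)
    for posting in postings:
        by_year[posting["year"]].append(posting)

    sample = []
    for year in sorted(by_year):
        group = by_year[year]
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


# Toy data: 8 postings from 2015 and 12 from 2016.
toy_postings = [{"year": 2015, "title": f"job-{i}"} for i in range(8)]
toy_postings += [{"year": 2016, "title": f"job-{i}"} for i in range(12)]

toy_sample = sample_by_year(toy_postings)
print(len(toy_sample))  # 5 (2 from 2015 plus 3 from 2016)
```

Sampling within each year, rather than across the whole list, keeps years with few postings from being under-represented in the augmented subset.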

Literature Review

The dataset analyzed for this project is aimed at job postings in library technology. There are several examples in the literature of job advertisement analyses that focus on library technology skills and the particular requirements and skillsets required for these types of positions. Despite the focus on technology skills, examples from the literature still show that a Master of Library and Information Science (MLIS) or equivalent degree is required between 79.3% and 90.1% of the time (Choi & Rasmussen, 2009; Triumph & Beile, 2015).

As in the sample examined in this study, many library jobs have a strong technology component. Triumph and Beile (2015) found that computer skills were explicitly stated and required in all but 17.2% of positions. Additionally, Choi & Rasmussen (2009) found that experience with digital library/digital information systems or services and web development/design experience were the most sought-after skills in academic library technology job postings from 1999-2007. An analysis of the jobs posted to Code4Lib from 2008-2018 (Gonzales, 2019) found changes in the technology skills required over that time period, including an increase in demand for Python, XML, and Java.

The presence of salary information in library job postings has also been examined in the literature. Triumph and Beile (2015) found that only 35.2% of postings in their sample listed salary information. Silva and Galbraith (2018) found that women and librarians with less experience were less likely to engage in salary negotiation, and they recommend clear salary information in job postings for greater equity.

Job Trend Analysis

The LITA ARC examined several aspects of jobs posted to the LITA job board, including information about the pool of employers and the type of employment sought. The committee also looked at the types of skills and qualifications sought in job postings and the presence of salary information. There were 395 unique employers in the dataset, and 71% of these posted only once. 11% of employers returned to post multiple jobs in the same year, while 18% returned in different years to post jobs. In the sample of jobs coded by ARC (n=172), most of the postings were for librarian positions (45.6%), followed by library technology (22.7%), administrative or director (19.2%), staff (11.6%) and teaching faculty (0.6%). An analysis of job title keywords from the sample reflected this breakdown and revealed mostly simple descriptive job title language. A word cloud of the results below shows “librarian” in the top position. Terms like “library,” “technology,” “director,” “services,” and “digital” were also popular in job titles.

Figure 1: Word Cloud of Job Title Keywords

The sample was a mix of library and technology jobs, so the percentage of the jobs posted to the LITA job board that required an MLIS was lower than what was observed in the literature. Only 62.2% of jobs required an MLIS overall. The figure below shows a pie chart of the overall percentage and a breakdown of required MLIS by job type: 94.9% of the positions coded as librarian required the MLIS, 78.8% of admin/director positions, 12.8% of technology positions, and 5% of staff positions.

Figure 2: Pie Chart Showing Percentage of positions with MLIS Required

The ARC also examined 70 complete postings from 2019 to understand the trends in job posting language. A word count analysis was conducted on the full text of the job descriptions, skills and duties, and qualifications. Data was cleaned to combine similar terms and clarify usage of ambiguous terms. Though only 19% of the jobs in this sample were categorized as administrators or directors, leadership, management, and supervisory skills were the most frequently mentioned skills in 2019 (436 instances). Communication, collaboration, and teamwork skills were also highly sought after (328 instances), followed by planning and strategic skills (152 instances). Technology skills frequently mentioned included development (116 instances), general digital and technology skills (331 instances), and software/hardware administration and maintenance (176 instances). Sought-after library skills included information/research (370 instances), reference (41 instances), data (115 instances), collections (120 instances), cataloging/metadata (79 instances), instruction/teaching (120 instances), and scholarship or scholarly communications (72 instances). In addition, these specific technology tool skills were frequently mentioned:

  • Web, websites: 116 instances
  • Discovery: 43 instances
  • Databases: 32 instances
  • Repository: 31 instances
  • Statistics: 19 instances
  • Server: 17 instances
  • ILS: 13 instances
  • Proxy: 13 instances
  • Primo: 12 instances
  • php: 12 instances
  • Digitization: 12 instances
  • Alma: 12 instances
  • Python: 11 instances
  • HathiTrust: 11 instances
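A word count of this kind, with similar terms combined before counting, can be sketched in a few lines. This is not the committee's actual procedure, which the report does not detail; the merge table `CANONICAL`, the function `count_terms`, and the sample texts are hypothetical and purely illustrative.

```python
import re
from collections import Counter

# Hypothetical merge table: variant spellings mapped to one canonical term.
CANONICAL = {
    "website": "web",
    "websites": "web",
    "databases": "database",
    "servers": "server",
    "repositories": "repository",
}


def count_terms(texts, canonical=CANONICAL):
    """Count lowercase word tokens across texts, merging variants via `canonical`."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            counts[canonical.get(token, token)] += 1
    return counts


# Invented sample text standing in for job posting descriptions.
sample_texts = [
    "Maintain the library website and web servers",
    "Experience with Python, PHP, and databases",
]

term_counts = count_terms(sample_texts)
print(term_counts["web"])       # 2
print(term_counts["database"])  # 1
```

A real analysis would also need a stop-word list and manual review of ambiguous terms, which is the cleaning step the report mentions.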

Salary information was posted for 39% of the positions that had complete information (n=223). This percentage is higher than the figure found by Triumph and Beile in their 2015 analysis, which could mean a positive trend in salary posting information. The highest listed annual salary was $233,000 and the lowest listed was $33,880. The average salary range was $67,331-$89,282.

Diversity & Inclusion Best Practices

The Diversity & Inclusion Committee analyzed a sample of the augmented job posting information and developed several recommendations for position posting.

Recommendations for Job Descriptions:

Best Practices:

Job Requirements:

  • Regularly revise or rewrite job descriptions to ensure that job requirements are clear and focused on the results of an activity rather than standardized requirements (Hire for Talent)
  • Avoid posting requirements that are nonessential and may disqualify candidates who are otherwise qualified for the position (Hire for Talent)
  • Clearly indicate the physical working conditions and hours of work (Hire for Talent)
  • Exclude educational requirements if they are not necessary for success in the position (Equity, Diversity and Inclusion in Recruitment, Hiring and Retention)


Choi, Y., & Rasmussen, E. (2009). What qualifications and skills are important for digital librarian positions in academic libraries? A job advertisement analysis. The Journal of Academic Librarianship, 35(5), 457-467.

Gonzales, B. M. (2019). Computer programming for librarians: A study of job postings for library technologists. Journal of Web Librarianship, 13(1), 20-36.

Mathews, J. M., & Pardue, H. (2009). The Presence of IT skill sets in librarian position announcements. College & Research Libraries, 70(3), 250-257.

Silva, E., & Galbraith, Q. (2018). Salary negotiation patterns between women and men in academic libraries. College & Research Libraries, 79(3), 324.

Triumph, T. F., & Beile, P. M. (2015). The trending academic library job market: An analysis of library position announcements from 2011 with comparisons to 1996 and 1988. College & Research Libraries, 76(6), 716-739.

Yang, Q., Zhang, X., Du, X., Bielefield, A., & Liu, Y. (2016). Current market demand for core competencies of librarianship—A text mining study of American Library Association’s advertisements from 2009 through 2014. Applied Sciences, 6(2), 48.

Additional Resources



The Pennsylvania Integrated Library System (PaILS) seeks a dynamic Executive Director with strong leadership and communication skills, strategic vision, and a track record of building strong customer relationships and delivering customer-centric solutions. The Executive Director is the chief operating officer of PaILS, a non-profit corporation, and is charged with implementing the organization’s policies and programs, exercising administrative oversight of all the affairs and finances of the consortium, overseeing the technical operation of the Pennsylvania library open-source software consortium, known as SPARK, and developing and implementing long-term organizational and strategic goals. Currently, PaILS has two FTEs and is supported through a combination of member fees and Library Services and Technology Act (LSTA) funds provided by PA’s Office of Commonwealth Libraries (OCL).

SPARK is a comprehensive, remotely hosted, “one-price” solution for over 150 libraries in Pennsylvania. The software includes circulation, cataloging, serials, and acquisitions functionality. The online library catalog for library end-users offers the enhanced functionality found in the most expensive commercial discovery layers. SPARK utilizes the Evergreen open-source integrated library system (ILS). Evergreen is currently available in over 1,000 libraries, both in the US and internationally, and has an active and growing community of developers and users.


The Executive Director provides overall leadership for PaILS and reports to a nine-member volunteer board of directors. The successful candidate will plan and manage the operations of the SPARK system to ensure system reliability and high-quality functioning for member libraries and other stakeholders. The Executive Director is also responsible for building relationships, and developing and implementing strategies for collaboration with the various PaILS stakeholders to ensure the effective operation of SPARK for Pennsylvania libraries. Key stakeholders include SPARK member libraries, the larger Evergreen open source community, key technology partners, and other library partner organizations within the Commonwealth. The new Executive Director will lead system staff in fostering an environment of collaboration, trust, and transparency that propels continual development of the SPARK system. Day-to-day administrative oversight of PaILS operations and grant writing are also key responsibilities.


While the ideal candidate may not possess all of the qualities listed below, s/he will possess many of the following professional and personal abilities, attributes, and experiences:

· Master’s Degree in Library Science (MLS) from an ALA-accredited institution (preferred) or equal experience and advanced education;

· Experience in business administration or a strong foundational knowledge of the principles of business administration, financial planning, and organizational management, with a minimum of five years of experience as a proven leader and administrator including a demonstration of:

· excellent written and verbal communication skills;

· ability to collaborate with a board of directors to achieve an organizational mission and implementation of successful development practices;

· ability to work with diverse stakeholders to achieve the organizational mission and goals, including proactive community relations, outreach, and advocacy;

· administrative and supervisory experience, along with experience in the management and leadership of member committees and cross-functional teams; and

· federal grant-writing experience.

· Strong knowledge of and technology management experience ideally with integrated library systems or technology systems designed for information management and consortial collaboration;

· Experience in building relationships with customers such as libraries, library organizations and consortia, state and local government officials and legislators;

· Experience in managing RFI and RFQ processes, product evaluations, and developing service level agreements and negotiating costs with vendor partners;

· Experience in conducting presentations and exhibits at conferences;

· Residence in Pennsylvania or willingness to become a Pennsylvania resident. May work from home.

· Willingness and ability to travel extensively throughout Pennsylvania.

SALARY: Minimum salary of $75,000 annually and a generous benefits package.


Interested individuals who meet these criteria should submit a single email containing a resume, a cover letter, and three professional references in PDF format to no later than June 1, 2020. Further details about specific job responsibilities or questions may also be sent to the same email address. Review of applications and interviews will begin immediately and will be ongoing until the position is filled.

The future of Big Te(a)ch / Mita Williams

Last week, my place of work announced that the university campus was going to be primarily online for the upcoming fall semester. From my understanding, the qualifier of primarily is being used because there are some professional programs that have compulsory in-person components such as in clinical nursing.

Replicating hands-on or lab components of classes is a particular challenge in the present moment. How do you replace what an anatomy class might mean to a medical student? When you are training students to do work in a chemistry lab, what do you do when you no longer have a lab to work in?

I have taken my fair share of lab courses and, to be honest, I recall many of them were stressful. I always felt the pressure of being on the clock and having to finish a series of steps towards an outcome that was unclear to me. Young me would have preferred the option of watching a lab instructor with a GoPro strapped to their forehead, going through the experiment on my behalf.

But watching another person complete a jigsaw puzzle is not the same as doing the jigsaw puzzle yourself.

How can we create rich, online or at-home experiences with choice and agency?

One answer is, The Future of Big Tech.

It’s not the future you think I mean. I’m referring to the 10-minute game The Future of Big Tech, which is available as pay-as-you-can from Coney’s Pop-Up Playhouse [from the menu, click on: 2+ Players > The Future of Big Tech].

Coney is a UK-based interactive theatre group whose work I’ve been casually following for some years now. I’ve only recently started exploring their online options. This past weekend, I played The Future of Big Tech with my kids and I really appreciated the opportunity to have a conversation afterwards about what the experience meant to them.

screenshot from The Future of Big Tech

I’m being vague here because I really don’t want to spoil the experience as it is one that you really should try. But if you are feeling apprehensive about putting on your headphones and diving in, I will tell you a bit of what you can expect.

Once you choose your character, you will hear a short description of who you are and how you live in a particular future. You will pick up a phone call and during the call, you will be given choices to make. There are no loud or sudden disturbing noises during the call and the game ends in under ten minutes.

The voice acting is very good. I’m adding it as evidence in my ‘augmented experiences are better than virtual ones‘ file.

I’m so impressed by how much this game achieves in such a short time. I also appreciate that the designers recognized that by dividing the experience into two, the game creates an easy entry into conversation afterwards, as each participant will want to ask the other for their side of the story.

It truly belongs on a syllabus.

DLF Forum and Fall Events Move Online / Digital Library Federation

2020 Forum & Affiliated Events

Based on the overwhelming responses to our community survey, the number and distribution of proposals for all CFPs, and CLIR’s ongoing monitoring of the pandemic situation, it has become clear to us that for the health and safety of our attendees and presenters, we must transition all of our fall events to a virtual format. We are sorry we won’t be able to explore Baltimore in 2020, but we’re already making arrangements to hold our 2022 events there, so we’ll just have to wait a bit longer to enjoy time together in Charm City.

What does this mean for 2020? We are not entirely sure yet, but we would love your input. We are happy to share this second community survey about virtual events in which we ask for your thoughts on what you’d like to see from a virtual CLIR/DLF event or series of events this fall. We would appreciate your input by Monday, June 1.

We understand that you may have questions, and while we may not have the answers just yet, we welcome the dialogue with you. Please don’t hesitate to reach out to us at, and thank you so much for your understanding and patience during this unprecedented time. Stay healthy and safe, and we’ll be in touch with more soon.


Congratulations to Dr. Jian Qin, winner of the 2020 LITA/OCLC Kilgour Research Award / LITA

Dr. Jian Qin has been selected as the recipient of the 2020 Frederick G. Kilgour Award for Research in Library and Information Technology, sponsored by OCLC and the Library and Information Technology Association (LITA). She is a Professor and Director at the iSchool, Syracuse University.
The Kilgour Award honors research relevant to the development of information technologies, especially work which shows promise of having a positive and substantive impact on any aspect(s) of the publication, storage, retrieval and dissemination of information, or the processes by which information and data are manipulated and managed. It recognizes a body of work probably spanning years, if not the majority of a career. The winner receives $2,000 and a citation.

Dr. Qin’s recent research projects include metadata modeling for gravitational wave research data management and big metadata analytics using GenBank metadata records for DNA sequences, both with funding from NSF. She also collaborated with a colleague to develop a Capability Maturity Model for Research Data Management funded by a grant from the Interuniversity Consortium for Political and Social Research (ICPSR). She was a visiting scholar at the Online Computer Library Center (OCLC), where she developed the learning object vocabulary project. She has published widely in national and international research journals. Dr. Qin was the co-author of the book Metadata and co-editor for several special journal issues on knowledge discovery in databases and knowledge representation.
“Dr. Qin has made immeasurable contributions to the field of metadata throughout her 20-year career, including writing Metadata, now in its second edition. This title is award-winning and a core text in the field. She has shown that it is possible to bridge the gap between the theoretical and the practical, not always an easy undertaking.”
When notified that she had been selected, Dr. Qin said, “I have always been awed by Kilgour’s legacy and feel tremendously honored to be selected for this award. I am grateful and humbled by this recognition from the LIS community.”
Members of the 2020 Kilgour Award Committee are Emma Kepron (Chair), Aimee Fifarek (Past Chair), David Ratledge, Colby Riggs, and Andrew Pace (OCLC Liaison).

About LITA

The Library and Information Technology Association (LITA) is the leading organization reaching out across types of libraries to provide education and services for a broad membership of systems librarians, library technologists, library administrators, library schools, vendors, and others interested in leading edge technology and applications for librarians and information providers. LITA is a division of the American Library Association. Follow us on our Blog, Facebook, or Twitter.

About OCLC

OCLC is a nonprofit global library cooperative providing shared technology services, original research and community programs so that libraries can better fuel learning, research and innovation. Through OCLC, member libraries cooperatively produce and maintain WorldCat, the most comprehensive global network of data about library collections and services. Libraries gain efficiencies through OCLC’s WorldShare, a complete set of library management applications and services built on an open, cloud-based platform. It is through collaboration and sharing of the world’s collected knowledge that libraries can help people find answers they need to solve problems. Together as OCLC, member libraries, staff and partners make breakthroughs possible.

The Death Of Corporate Research Labs / David Rosenthal

In American innovation through the ages, Jamie Powell wrote:
who hasn’t finished a non-fiction book and thought “Gee, that could have been half the length and just as informative. If that.”

Yet every now and then you read something that provokes the exact opposite feeling. Where all you can do after reading a tweet, or an article, is type the subject into Google and hope there’s more material out there waiting to be read.

So it was with Alphaville this Tuesday afternoon reading a research paper from last year entitled The changing structure of American innovation: Some cautionary remarks for economic growth by Arora, Belenzon, Patacconi and Suh (h/t to KPMG’s Ben Southwood, who highlighted it on Twitter).

The exhaustive work of the Duke University and UEA academics traces the roots of American academia through the golden age of corporate-driven research, which roughly encompasses the postwar period up to Ronald Reagan’s presidency, before its steady decline up to the present day.
Arora et al argue that a cause of the decline in productivity is that:
The past three decades have been marked by a growing division of labor between universities focusing on research and large corporations focusing on development. Knowledge produced by universities is not often in a form that can be readily digested and turned into new goods and services. Small firms and university technology transfer offices cannot fully substitute for corporate research, which had integrated multiple disciplines at the scale required to solve significant technical problems.
As someone with many friends who worked at the legendary corporate research labs of the past, including Bell Labs and Xerox PARC, and who myself worked at Sun Microsystems' research lab, this is personal. Below the fold I add my 2c-worth to Arora et al's extraordinarily interesting article.

The authors provide a must-read, detailed history of the rise and fall of corporate research labs. I lived through their golden age; a year before I was born the transistor was invented at Bell Labs:
The first working device to be built was a point-contact transistor invented in 1947 by American physicists John Bardeen and Walter Brattain while working under William Shockley at Bell Labs. They shared the 1956 Nobel Prize in Physics for their achievement. The most widely used transistor is the MOSFET (metal–oxide–semiconductor field-effect transistor), also known as the MOS transistor, which was invented by Egyptian engineer Mohamed Atalla with Korean engineer Dawon Kahng at Bell Labs in 1959. The MOSFET was the first truly compact transistor that could be miniaturised and mass-produced for a wide range of uses.
Arora et al Fig 2.
Before I was 50, Bell Labs had been euthanized as part of the general massacre of labs:
Bell Labs had been separated from its parent company AT&T and placed under Lucent in 1996; Xerox PARC had also been spun off into a separate company in 2002. Others had been downsized: IBM under Louis Gerstner re-directed research toward more commercial applications in the mid-90s ... A more recent example is DuPont’s closing of its Central Research & Development Lab in 2016. Established in 1903, DuPont research rivaled that of top academic chemistry departments. In the 1960s, DuPont’s central R&D unit published more articles in the Journal of the American Chemical Society than MIT and Caltech combined. However, in the 1990s, DuPont’s attitude toward research changed and after a gradual decline in scientific publications, the company’s management closed its Central Research and Development Lab in 2016.
Arora et al point out that the rise and fall of the labs coincided with the rise and fall of anti-trust enforcement:
Historically, many large labs were set up partly because antitrust pressures constrained large firms’ ability to grow through mergers and acquisitions. In the 1930s, if a leading firm wanted to grow, it needed to develop new markets. With growth through mergers and acquisitions constrained by anti-trust pressures, and with little on offer from universities and independent inventors, it often had no choice but to invest in internal R&D. The more relaxed antitrust environment in the 1980s, however, changed this status quo. Growth through acquisitions became a more viable alternative to internal research, and hence the need to invest in internal research was reduced.
Lack of anti-trust enforcement, pervasive short-termism driven by Wall Street's focus on quarterly results, and management's focus on manipulating the stock price to maximize the value of their options killed the labs:
Large corporate labs, however, are unlikely to regain the importance they once enjoyed. Research in corporations is difficult to manage profitably. Research projects have long horizons and few intermediate milestones that are meaningful to non-experts. As a result, research inside companies can only survive if insulated from the short-term performance requirements of business divisions. However, insulating research from business also has perils. Managers, haunted by the spectre of Xerox PARC and DuPont’s “Purity Hall”, fear creating research organizations disconnected from the main business of the company. Walking this tightrope has been extremely difficult. Greater product market competition, shorter technology life cycles, and more demanding investors have added to this challenge. Companies have increasingly concluded that they can do better by sourcing knowledge from outside, rather than betting on making game-changing discoveries in-house.
They describe the successor to the labs as:
a new division of innovative labor, with universities focusing on research, large firms focusing on development and commercialization, and spinoffs, startups, and university technology licensing offices responsible for connecting the two.
An unintended consequence of abandoning anti-trust enforcement was thus a slowing of productivity growth, because this new division of labor wasn't as effective as the labs:
The translation of scientific knowledge generated in universities to productivity enhancing technical progress has proved to be more difficult to accomplish in practice than expected. Spinoffs, startups, and university licensing offices have not fully filled the gap left by the decline of the corporate lab. Corporate research has a number of characteristics that make it very valuable for science-based innovation and growth. Large corporations have access to significant resources, can more easily integrate multiple knowledge streams, and direct their research toward solving specific practical problems, which makes it more likely for them to produce commercial applications. University research has tended to be curiosity-driven rather than mission-focused. It has favored insight rather than solutions to specific problems, and partly as a consequence, university research has required additional integration and transformation to become economically useful.
In Sections 5.1.1 through 5.1.4 Arora et al discuss in detail four reasons why the corporate labs drove faster productivity growth:
  1. Corporate labs work on general purpose technologies. Because the labs were hosted by the leading companies in their market, they believed that technologies that benefited their product space would benefit them the most:
    Claude Shannon’s work on information theory, for instance, was supported by Bell Labs because AT&T stood to benefit the most from a more efficient communication network ... IBM supported milestones in nanoscience by developing the scanning electron microscope, and furthering investigations into electron localization, non-equilibrium superconductivity, and ballistic electron motions because it saw an opportunity to pre-empt the next revolutionary chip design in its industry ... Finally, a recent surge in corporate publications in Machine Learning suggests that larger firms such as Google and Facebook that possess complementary assets (user data) for commercialization publish more of their research and software packages to the academic community, as they stand to benefit most from advances in the sector in general.
    My experience of Open Source supports this. Sun was the leading player in the workstation market and was happy to publish and open source infrastructure technologies such as NFS that would buttress that position. On the desktop it was not a dominant player, which (sadly) led to NeWS being closed-source.
  2. Corporate labs solve practical problems. They quote Andrew Odlyzko:
    “It was very important that Bell Labs had a connection to the market, and thereby to real problems. The fact that it wasn’t a tight coupling is what enabled people to work on many long-term problems. But the coupling was there, and so the wild goose chases that are at the heart of really innovative research tended to be less wild, more carefully targeted and less subject to the inertia that is characteristic of university research.”
    Again, my experience supports this contention. My work at Sun Labs was on fault-tolerance. Others worked on, for example, ultra high-bandwidth backplane bus technology, innovative cooling materials, optical interconnect, and asynchronous chip architectures, all of which are obviously "practical problems" with importance for Sun's products, but none of which could be applied to the products under development at the time.
  3. Corporate labs are multi-disciplinary and have more resources. As regards the first of these, the authors use Google as an example:
    Researching neural networks requires an interdisciplinary team. Domain specialists (e.g. linguists in the case of machine translation) define the problem to be solved and assess performance; statisticians design the algorithms, theorize on their error bounds and optimization routines; computer scientists search for efficiency gains in implementing the algorithms. Not surprisingly, the “Google translate” paper has 31 coauthors, many of them leading researchers in their respective fields
    Again, I would agree. A breadth of disciplines was definitely a major contributor to PARC's successes.

    As regards extra resources, I think this is a bigger factor than Arora et al do. As I wrote in Falling Research Productivity Revisited:
    the problem of falling research productivity is like the "high energy physics" problem - after a while all the experiments at a given energy level have been done, and getting to the next energy level is bound to be a lot more expensive and difficult each time.
    Information Technology at all levels is suffering from this problem. For example, Nvidia got to its first working silicon of a state-of-the-art GPU on $2.5M from the VCs, which today wouldn't even buy you a mask set. Even six years ago system architecture research, such as Berkeley's ASPIRE project, needed to build (or at least simulate) things like this:
    Firebox is a 50kW WSC building block containing a thousand compute sockets and 100 Petabytes (2^57B) of non-volatile memory connected via a low-latency, high-bandwidth optical switch. ... Each compute socket contains a System-on-a-Chip (SoC) with around 100 cores connected to high-bandwidth on-package DRAM.
    Clearly, AI research needs a scale of data and computation that only a very large company can afford. For example, Waymo's lead in autonomous vehicles is based to a large extent on the enormous amount of data that has taken years of a fleet of vehicles driving all day, every day to accumulate.
  4. Large corporate labs may generate significant external benefits. By "external benefits", Arora et al mean benefits to society and the broader economy, but not to the lab's host company:
    One well-known example is provided by Xerox PARC. Xerox PARC developed many fundamental inventions in PC hardware and software design, such as the modern personal computer with graphical user interface. However, it did not significantly benefit from these inventions, which were instead largely commercialized by other firms, most notably Apple and Microsoft. While Xerox clearly failed to internalize fully the benefits from its immensely creative lab ... it can hardly be questioned that the social benefits were large, with the combined market capitalization of Apple and Microsoft now exceeding 1.6 trillion dollars.
    Two kinds of company form these external benefits. PARC had both spin-offs, in which Xerox had equity, and startups that built on PARC's ideas and hired its alumni but in which Xerox had no equity. Xerox didn't do spin-offs well:
    As documented by Chesbrough (2002, 2003), the key problem there was not Xerox’s initial equity position in the spin-offs, but Xerox’s practices in managing the spin-offs, which discouraged experimentation by forcing Xerox researchers to look for applications close to Xerox’s existing businesses.
    But Cisco is among the examples of how spin-offs can be done well, acting as an internal VC to incentivize a team by giving them equity in a startup. If it was successful, Cisco would later acquire it.

    Sun Microsystems is an example of exceptional fertility in external startups. Nvidia was started by a group of frustrated Sun engineers. It is currently worth almost 30 times what Oracle paid to acquire Sun. It is but one entry in a long list of such startups whose aggregate value dwarfs that of Sun at its peak. As Arora et al write:
    A surprising implication of this analysis is that the mismanagement of leading firms and their labs can sometimes be a blessing in disguise. The comparison between Fairchild and Texas Instruments is instructive. Texas Instruments was much better managed than Fairchild but also spawned far fewer spin-offs. Silicon Valley prospered as a technology hub, while the cluster of Dallas-Fort Worth semiconductor companies near Texas Instruments, albeit important, is much less economically significant.
    An important additional external benefit that Arora et al ignore is the Open Source movement, which was spawned by Bell Labs and the AT&T consent decree. AT&T was forced to license the Unix source code. Staff at institutions, primarily universities, which had signed the Unix license could freely share code enhancements. This sharing culture grew and led to the BSD and GNU licenses that underlie the bulk of today's computational ecosystem.
Jamie Powell was right that Arora et al have produced an extremely valuable contribution to the study of the decay of the vital link between R&D and productivity of the overall economy.

The importance of libraries and access to information in Zimbabwe: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Shadreck Ndinde from the Zimbabwe Library Association who received funding from the Foreign and Commonwealth Office to highlight the importance of open data in promoting and supporting the girl child as well as raising awareness of the negative effects of gender-based violence against women and the role that libraries can play in providing current awareness to communities.

The Zimbabwe Library Association (ZimLA) through one of its branches in Masvingo Province successfully hosted the international open data event at Chirichoga High School in Masvingo Province on the 7th of March 2020.

Parents, members of the community and peer groups met under the theme ‘Data for Equal Development’. Open Data Day is an annual event celebrated internationally on the 7th of March. Zimbabwe was among several countries in Africa selected by the Open Knowledge Foundation to conduct the workshop. 

ZimLA’s event highlighted the importance of open data in promoting and supporting the girl child as well as raising awareness of the negative effects of gender-based violence against women. It displayed the role that libraries can play in providing current awareness to communities and how these communities can access information.

The main thrust of our event was to educate the girl child on the importance of open data. Data or information should be always available, free and accessible. This would benefit the girl child, especially those in rural areas who are deprived of their rights through early marriages, rape, forced marriages and domestic violence. The association plays a part in the global fight for gender equality. ZimLA seeks to promote awareness of the Beijing Declaration and Platform for Action (1995). The event created an opportunity for the librarians to demonstrate the role played by libraries in delivering on the Sustainable Development Goal number 5, ‘Achieve gender equality and empower all women and girls’ in line with the International Federation of Library Associations and Institutions (IFLA) United Nations (UN) Agenda 2030.

Students presented a number of songs and poems that highlighted the importance of open data to the girl child. The event also created an opportunity for policy makers to discuss how libraries can provide free open data and deliver on the UN SDGs and national vision 2030 as well as the Africa vision 2063. 

The headmaster, Mr Masomere, gave opening remarks at the event and called upon libraries to support development and awareness. The library was both a vital resource for students and a fact-driven organisation.

Representing the National Executive Council (NEC), Mr Praymore Tendai gave a presentation that highlighted that 32% of girls in Zimbabwe were married off before they attained 18 years of age. 4% were married before they turned 15 and approximately 1 in every 3 girls was married before the age of 18. The girl child was the most vulnerable member at household level. He also spoke about the role of open data in the development of Zimbabwe and how the lack of information was a main cause for the vice. The delegates also discussed how libraries may be more effective in dealing with social inequality and child marriages.


Member of Parliament Masvingo West Constituency and Child President of the Chiefs Parliament of Zimbabwe, Ms Tsitsi Mutyaka thanked the national library association for bringing the event to rural communities like Nemwanwa where girls are normally left out. “We learned the importance of open data in libraries, we learnt on how we can use the open data in improving our lives and changing our lives for the better and stopping early child marriages, early pregnancies and gender based violence,” said Ms Mutyaka. She challenged the community authorities to create space for libraries and leisure for children.

The National Treasurer, Mr Poterai, later presented on open data and community libraries. He highlighted the role played by community libraries in providing affordable and free information to their patrons. Delegates got a chance to network and interact with the students at the end of the presentations. There were a number of discussions that saw the event closing past the set time of 3pm on the programme.

Collection directions accelerated? Pandemic effects. / Lorcan Dempsey

Over the past few years I have been talking about three systemic ways in which collections, broadly understood, are evolving in a network environment. They are: the collective collection, the facilitated collection, and the inside-out collection. In different ways, each moves beyond the carefully constructed and locally acquired collection. I believe that we will see accelerated broad adoption of these …

The post Collection directions accelerated? Pandemic effects. appeared first on Lorcan Dempsey's Weblog.

Teaching girls about environmental data in Tanzania: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Dr Hector Mongi from the University of Dodoma in Tanzania who received funding from the Foreign and Commonwealth Office to invite girls from a local school to a geospatial open data networking event to instil environmental thinking among young girls.

The University of Dodoma (UDOM) in Tanzania with support from the Open Knowledge Foundation (OKF) and its partners, organised our Open Data Day activity on March 7, 2020.

The University invited young girls from Dodoma Secondary School to celebrate the day through a networking session that ended with a demonstration of open environmental data resources and some hands-on practices. Fifty girl students accompanied by their teacher, Ms Mwajuma Musa, together with an enthusiastic team of UDOM Library staff gained knowledge about open data and open environmental data resources.

The opening session was chaired by Dr Hilda Mwangakala, Deputy Principal of College of Informatics and Virtual Education of UDOM. Before her inspirational opening talk, I briefed Dr Mwangakala and the rest of participants about Open Data Day, where it was being celebrated globally and how the participants would benefit from the sessions.

Kicking off the networking session, Ms Agatha Mashindano, the Coordinator for Institutional Repository at UDOM, introduced the students to the open access databases, zooming in on the UDOM Institutional Repository and other available free online resources.

Later I took the participants through more open databases and other open resources for environmental data. The first one was Resource Watch, then NASA Open Data, Global Forest Watch, and ESRI’s COVID-19 Hub.

During the session, Dr Grace Msoffe, who is Director of Library Services at UDOM and the host of the event, encouraged the young girls to pursue their dreams because they are able. She led them in singing and dancing to the song “I am a Superwoman”, which was composed by Tanzanian Women All Stars.

After the introductory topics on open data for the environment, participants performed hands-on exercises with guidance from the library staff and other technical personnel. The presentations and hands-on session were followed by a library tour where the participants were shown the many collections on informatics, virtual education, earth sciences and engineering available at the UDOM Library.

Migrant women, housing activism and collective mapping in Spain: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report from the TuTela Learning Network in Spain who received funding from Mapbox to map the housing situation of migrant women in Granada.

On the 7th of March, as part of Open Data Day, we organised a workshop with migrant women. We discussed our experiences of housing in the city of Granada, Spain and learned about the relevance of open data in housing initiatives. 

The event was organised in a collaboration between the TuTela Learning Network, an international multilingual open initiative for the exchange between marginalised activist experiences, and Colectivo Sirirí, an association of migrant women in Granada. The facilitators of the event were: Kitti Baracsi, Diana Carolina Escobar Blanco and Dresda Emma Méndez de la Brena. The event took place in the Biblioteca Social hnos Quero, a self-organised community library, to which we are grateful for hosting our event. 

The event aimed to raise awareness about the importance of open data and at the same time to boost the dialogue about the housing situation of migrant women in the city of Granada and open the ground for further encounters and collaborations. Several initiatives work on housing rights; however, we aimed to create a space where migrant women of very diverse backgrounds and experiences can meet and talk for themselves. 

When we announced the event, it attracted significant interest but many of the inquiries came from persons who did not have a migratory experience. Therefore, we proposed to organise other encounters in the future for a wider public. We held the workshop with a smaller group of migrant women, most of them relative newcomers to the city.

The encounter had two parts: the first aimed to share our experiences and feelings about our housing situation. First, through a drawing exercise, we made connections between our body and housing, acknowledging that our body is the first territory that we inhabit. The activity led to a reflection on how the conditions in which we live impact our mental and physical wellbeing. Then, with an acting exercise, we went on to discuss housing justice by having a closer look at the inequalities and the underlying mechanisms. We concluded the discussion with a collective mapping of our housing experiences in Granada.

We dedicated the second part of the workshop to learning what open data means, as it was a relatively new concept to various participants. Together we discovered different open mapping tools and initiatives in housing activism that use those tools. For instance, we learnt about the Anti-Eviction Mapping Project from San Francisco.

We have decided to create a mailing list and take the initiative of an open mapping project about housing experiences, with particular attention to positive housing experiences of migrant women in the city of Granada. The idea was to call it the map of “casas acogedoras” and work towards the creation of a support network along with the mapping. We have also decided to support other collectives of migrant women in replicating the workshop and collaborate on this initiative. 

(Due to the lockdown in Spain and the energy we had to dedicate to other solidarity projects, we had to suspend this process. We will proceed after the lockdown as we see it of special importance, especially in the current situation. In addition, we plan to organise events that connect different questions from local activists and open a space for learning about the potential of open mapping projects collectively.)

Julie Allinson (1974-2020) / Samvera

A message from Jon Dunn, Chair of the Samvera Steering Group:

On behalf of the Samvera Steering Group and Richard Green our Operations Adviser, I want to express our collective deep sorrow for the death of our friend, colleague, and fellow Steering Group member, Julie Allinson. As many of you know, Julie has been an integral part of Samvera and the broader repository community for many years in her time at the University of York, CoSector, and most recently at Notch8. We know that her loss will be felt deeply at both a personal and professional level within our community. We will all miss her positive can-do attitude, wry sense of humor, and thoughtful dedication to the success of Samvera and its users.

Our thoughts are with Julie’s wife Louise, her family, her Notch8 colleagues, and all who have been lucky to have Julie as a part of their life.

The post Julie Allinson (1974-2020) appeared first on Samvera.

Recognition doesn’t have to be a zero sum game / Meredith Farkas


As usual, the week the 2020 Library Journal Movers and Shakers were announced, I saw plenty of complaints about the award and, in some cases, awardees. I’ve been reading this sort of hurtful negativity since 2006 when I was named a Mover and Shaker (and a friend of mine wrote a blog comment calling us “the high-profile, self-promoting elite”) and it’s the same thing every single year. I’ve written Twitter threads in response to this in the past to help make people aware of how terrible it feels to see people disparaging the legitimacy of an award you just received (or, worse, claiming that “certain people” didn’t deserve it). But I think this also reveals many people’s insecurities in ways they didn’t necessarily intend. 

Last week, there was an interesting example of this from the world of lifestyle influencers (a term that honestly makes me nauseous). Alison Roman is a cookbook author whose recipes are frequently featured in the New York Times. I’ve never made any of them, but I’ve seen people all over social media raving about her recipes. In a recent profile she talked about her desire to be “bigger” than she is, but without giving up control over her brand and the charm that comes with being small-time. Instead of highlighting people who have been able to walk that fine line, she decided to attack two women of color, Marie Kondo and Chrissy Teigen, whom she perceives as having sold out because they sell products. The racial and gender dynamics of this were not lost on anyone, but what really struck me was how deeply, deeply insecure Alison Roman must be to feel the need to denigrate others to make herself feel more virtuous. What she did was totally shitty and racist, but it was also a window into how awful she must feel about herself inside her own head. And, while Alison might have thought Chrissy Teigen was too big and famous to have feelings, surprise(!?) everyone does.

And Alison wrote about her insecurities in her eventual apology to Chrissy and Marie (after some pretty ridiculous tweets about HER being the one who was bullied and her words being a critique of capitalism — I’m so TIRED of people making excuses for being jerks to individuals by saying they’re “critiquing” larger things). She talked about her “inability to appreciate my own success without comparing myself to and knocking down others” and how her comments “were rooted in my own insecurity.” Damn right. And we can say “wow, she’s terrible,” but I think most of us have had those thoughts at some point. “Why has x achieved so much and I haven’t?” Or “why did they get this award when I did x and no one cared?” Or “does x really deserve y more than I do?” We might not express it in a magazine or even on social media, but comparing ourselves to others is a deeply human thing to do. It’s also really counterproductive and can even be toxic. It certainly was toxic for me.

In the midst of the Mover and Shaker vitriol (and some lovely congratulations!) was an incredibly thoughtful and wise Twitter thread by Annie Pho that both validated the negative things people were feeling and asked them to reflect on why they feel that way:

I couldn’t agree more. Those feelings you’re feeling are so real and valid. But when you are having negative feelings about someone else’s success, it’s worth reflecting and considering why that is. Usually what you’re feeling is less about that person and more about something in your life that you’re feeling dissatisfied with or anxious about or something that is systemically problematic.

I know what it’s like to want recognition, but I don’t know anyone, including myself, who ever really felt better after chasing and receiving external validation. It’s an unending treadmill of want; a sense that good things lie just a bit out of reach to make you believe that if you work just a little bit harder you’ll feel whole. I remember being someone who really wanted to be named a Library Journal Mover and Shaker because that’s what all the people I admired had achieved. Yet when I was recognized, I actually found myself feeling worse. It brought up all of my feelings of insecurity and unworthiness. And people who were jealous were great at validating my unworthiness. When I attended a conference about a month after becoming a Mover and Shaker, a “friend” of mine, instead of congratulating me, went on and on about all the amazing projects she’d done over the past few years and asked me why she didn’t get it and I did. The next time you think of denigrating an award someone won or questioning whether someone was deserving of recognition, please remember that the honorees have feelings and insecurities, just like you. You might just be confirming to them all of their worst fears about themselves. 

Usually, people who seek external recognition are trying to fill some hole. Maybe they think it will fill the yawning chasm of need for love and approval inside them. Maybe they think it’ll make them feel less insecure; like they finally belong and aren’t an impostor. Maybe they think it will make up for the lack of recognition they receive at work or the toxic workplace they toil in day after day. Whatever the reason, those issues will still be there even after you receive an award. In my case, the award only magnified them. Almost no one at my place of work congratulated me for being named a Mover and Shaker, which only added to my impostor syndrome. Clearly if I was worth anything or deserved the recognition, people would congratulate me, right?

Dunking on other people who have achieved things I haven’t has never made me feel better about myself. In the end, for most of us, the approval we really need is from ourselves. And when I decided to stop seeking external validation and focus on work that makes me feel good, I felt a whole hell of a lot better. I don’t win awards anymore. My boss rarely praises anything I do. I don’t get invited to speak at lots of conferences. Yet I feel like the work I’m doing now is some of the most valuable and student-centered of my career. I’m proud of myself. And really, all this time (after all this striving and pushing to achieve and making myself miserable), my own approval was just what I needed. Go figure.

I can also tell you that congratulating people and recognizing their awesomeness feels a whole hell of a lot better than judging or dunking on people. This year, I’ve been writing notes to colleagues thanking them for various and sundry things. People so rarely get recognition for important but unsexy things like creating awesome equitable processes, or developing useful documentation, or being the type of person who always helps out, or just generally being a loyal, kind, and encouraging colleague. I hope getting the notes brightened their days, because writing them made me feel great. I’ve never felt good being snarky about someone else.

That said, awards can be really problematic, but that’s more about the award criteria, who is judging it, or how they are recruiting candidates. I’ve criticized our professional awards on my blog and in American Libraries and I have a lot of issues about how and what we choose to reward. There are lots of ways we can criticize the problems with awards without targeting the people who receive them. #oscarssowhite wasn’t saying that Tom Hanks’ acting doesn’t deserve recognition, but that there are amazing performances by BIPOC actors, directors, costume designers, editors, etc. who deserve acclaim as well. And it brought up larger systemic issues that prevent BIPOC and women from ever having the opportunities to showcase their skills in ways that could be rewarded. I’ve seen the LJ Movers and Shakers become a more diverse group over the years (though they still have plenty of room for improvement) and I think that came from criticisms about how white previous groups of honorees were. That is positive change.

I think it’s valuable for our profession to examine how we reward people and also how (speaking, writing, etc.) opportunities are given out. I’ve seen things improve since I entered the profession — fewer all-male panels, more women of color presenters, more BIPOC scholarly authors — but there is still a lot of favoritism that keeps the same people in privileged positions. While some decisions about who speaks are decided by diverse committees, some conferences are run by the same few people year after year who invite their favorite people to speak over and over again. I was one of the “usual suspects” who spoke at a certain conference for a few years and I became increasingly uncomfortable with the clubby cliquishness of it all, so I stopped attending. There are journals that are run the same way. It’s this sort of insular clubbiness that allowed a racist and casually sexist essay to end up in Against the Grain. Against the Grain did retract the article, but in no way addressed how it ended up in the publication nor what they will do in the future to prevent it from happening. But, really, we all know how it happened because we’ve seen it before. I know that I would never write for a publication that would publish racist garbage and then avoid discussing how they will do better in the future.

In the end, everyone deserves recognition for their good work and kindness in the workplace or their service to the profession. I don’t understand why people are often so stingy with praise and recognition; it’s not as if it’s hard to give. Expressing gratitude is truly the easiest thing in the world to do and gratitude is an endlessly renewable resource. I’ve learned from generous role models and I try to model that in my own life, though I’m far from perfect. 

Friend, you are awesome. You deserve good things. You deserve recognition for all the great stuff you’re doing. But I promise you, no award is going to make you feel better or more deserving or like less of an impostor. The approval you’re craving needs to come from you. You’ve got this. And go tell someone else they’re awesome today! And nominate people whose work you admire for awards! 

Image credit: cc-by-sa 2.0 by on Flickr. Image description: Dog pointing at you saying “Who’s awesome? You’re awesome.”

Bringing together the open data community in Russia: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Kseniya Orlova from Infoculture in Russia who received funding from the Open Contracting Partnership to hold a conference on open data and information transparency.

Infoculture marks Open Data Day 2020 in Moscow, Russia

On March 6th and 7th 2020, a significant conference took place in Moscow to mark Open Data Day. The conference brought together 576 participants and 78 speakers with different interests – data analytics, data science, data visualisation and data journalism. The event organisers were our NGO Infoculture and the Association of Data Market Participants.

The main topic of the programme was the restart of openness and the role of open data in modern Russia. One of the guests was Alexey Kudrin, Chairman of the Accounts Chamber of the Russian Federation, who greeted the participants and emphasised the importance of citizens’ access to truthful and objective information and data. 

The programme included 24 different discussions, workshops and case studies around important topics: 

  • How open public finances should be organised 
  • Why business and citizens need access to data from the government information systems
  • How open data supports the work of socially-oriented NGOs
  • The future of geospatial data in Russia
  • How to research quality of life in Russian cities using open data
  • Best practices of data-driven newsrooms in the Russian media
  • Best cases in data design and visualisation
  • Several workshops on data analysis, e.g. COVID-19: How to model the spread of the coronavirus?

Thanks to the support of the Open Knowledge Foundation, InfoCulture covered the travel and accommodation costs of three speakers from different cities of Russia. We were also happy that there were several award ceremonies during our Open Data Day conference:

  • InfoCulture announced the winners of the first micro-grants programme for the support of projects based on open data and open source code
  • The winners of the first All-Russian Data-Visualisation Award were also announced. Participants could view an exhibition of the best infographics and visualisation works during the two days of the conference

From this experience, we see how the open data community in Russia is expanding and becoming more developed year after year. We are confident that future Open Data Days will help us to synchronise the agenda, and also record and celebrate the success of the Russian open data community.

A climate data sprint in Germany focused on renewable energy: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Aileen Robinson from WikiRate in Germany who received funding from Datopian to engage the public in the research and collection of open data about how companies are impacting climate change.

Wikirate celebrates Open Data Day 2020 in Berlin, Germany

Making data about corporate environmental and social impacts open and accessible to all is the core of our work at WikiRate. This March we were delighted to organise a climate data sprint as part of Open Data Day, an annual celebration organised by the Open Knowledge Foundation to champion open data across the world. 

The aim of the sprint was to engage the public in a deep-dive look at corporate renewable energy commitments and performance as part of our project to collect environmental data about the top 100 corporate emitters and better understand their impacts. 

The project

Last year we launched a new open data project in collaboration with Plan A to collect environmental impact data on the 100 companies with the highest greenhouse gas emissions in the world, as set out in the Climate Action 100+ list. We began this project by collecting emissions data on these companies, and soon broadened the scope to also include data on corporate policies and energy usage. Due to the complexity of capturing comparable data on renewables, we decided to frame our data sprint around this topic. 

The transition to renewable energy is recognised as a necessity if we are to lower our dependence on fossil fuels and reduce emissions across the world. Collecting and tracking the ways in which companies are delivering on this goal is a complex task. For us, the key question was: how can we best leverage public data to compare the performance and commitments to renewable energy transition of the companies? With this question in mind, we began preparations for the event. 

The open data sprint

We invited members of the public to take part in a data sprint in Berlin on the 6th of March to help us frame the research and start collecting open data on some of the top emitters. We started by setting up some key metrics on the WikiRate platform to test out during the event. These included Global Reporting Initiative metrics on energy and fuel usage, as well as some new metrics on renewable energy usage and renewable energy commitments. Our partner organisation, Ecosia, kindly offered to host the event in their office and to give some insights on how they have used information like this through their Green Leaf project.

On a cold and wet Berlin day, we were joined by 30 attendees who generously gave their time to brainstorm the topic and add open data about the companies on the WikiRate platform. The attendees came from a diversity of backgrounds including sustainability professionals, renewable energy experts, students and data scientists. We were also joined by Pascal Tsachouridis, representative of Naturstrom, who contributed his valuable expertise on the subject.

The findings

Some of the key observations from the event focused on the accessibility of data. Many of the companies researched by the attendees did publish some data on their renewable energy usage and commitments but the data was patchy and difficult to get to – mostly hidden in hundred-page sustainability reports. 

Another issue which came up again and again was a lack of transparency in energy performance reporting. Most of the companies did not provide a methodology or use a reporting standard for the calculation of their energy usage. Similarly, companies used different energy units to report their energy consumption, and the meaning of the abbreviations was not always published. This meant the researchers were left unsure whether the manner in which companies reported their energy usage had left them comparing apples with oranges. 

On the topic of commitments, 73% of the companies researched did have some kind of commitment towards renewable energy transition. The attendees were able to collect some structured data on those companies that set specific targets on renewables; the problem lay with the companies whose commitments were too vague to make any sort of comparison with their energy performance.

The research continues

The research does not stop here. Throughout this year we will be continuing to engage the public in collecting open data on the climate impacts and renewable energy performance of the companies. The data sprint provided an ideal jumping-off point for narrowing in on what data is out there, and how we can structure the collection of this data in a way that makes it truly accessible and comparable. 

Free and open data will help us to understand companies’ impacts and to fight climate change on many fronts: 1) governments and analysts can take better decisions about emissions regulations and thresholds, 2) companies can improve their performance, 3) investors can more accurately assess sustainable investments, and 4) consumers are empowered to make sustainable purchasing and lifestyle decisions, and influence environmentally friendly policies.

We look forward to continuing this work in the coming months, and would urge anyone reading who would like to be involved to get in touch with our team.

A final word to say thank you to everyone who donated their valuable time to take part in the data sprint, to our event partners Plan A and Ecosia and to the Open Knowledge Foundation for supporting the event. We’d also like to give a shout-out to Restaurant Sotto who fuelled the research with some delicious vegetarian and vegan pizzas!

The Climate Change & Renewable Energy crowd-research project is active on the WikiRate platform. Do you want to help us with this project? You can sign up as a volunteer researcher and start contributing right away.

Discussing the lack of public data in Namibia: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report from We Are Capable Data for Good Namibia (WACDGN) who received funding from the Foreign and Commonwealth Office to train young Namibians in using data science skills for sustainable development projects.

WAC Namibia marks Open Data Day 2020

WAC Namibia, in collaboration with the Faculty of Computing and Informatics at NUST, organised an Open Data Day event which was attended by a total of 68 participants from all sectors and industries. The event was funded by ISOC Namibia and the Open Knowledge Foundation.

The event had four speakers:

  • Lameck Amugongo who is an open data pioneer
  • Jacobine Amutenya from WAC Namibia
  • Royal Mabakeng from the Land Management Department
  • Rocky Crest Higher School learners

The event included a demonstration of what open data is all about. Participants were divided into groups for a quiz, and the first group emerged as the fastest team to answer all the questions and was given a present. During the day, participants were shown how to navigate the Jupyter Notebook application, which aimed to engage them further. The audience had a go at exploring available data and analysing it using Jupyter notebooks.

There were discussions about making public data available in Namibia. Speakers mentioned that we need access to public data to do research. Speakers emphasised that we do not have an open data policy in Namibia and organisations need to comply to achieve this goal.

Exploring budget data in Cameroon: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report from Afroleadership in Cameroon who received funding from Hivos to organise a training on the analysis of budget data by civil society using open data.

Afroleadership marks Open Data Day 2020 in Cameroon

Open Data Day 2020 Cameroon was held in Yaoundé and saw the mobilisation of around forty participants made up of civil society organisations, journalists and activists in the field of governance, peace journalism and civic technologies. The workshop started with an introduction on open data by Charlie Martial NGOUNOU, the president of AfroLeadership. 

Starting with the symbolism of the celebration of Open Data Day, he explained what open data is, using Tim Berners-Lee’s 5 Star Linked Data model. He finally called attention to Law No.2019/024 of 24 December 2019 which called on public authorities in Cameroon to publish data on websites accessible to local populations, a great opportunity for civil society organisations doing advocacy for open data.

After these remarks, the floor was given to Mr. Guy Merlin TATCHOU. Guy took the participants through an introduction to the OpenSpending platform, applying some of the open data principles introduced above to a dataset from local governments. He gave a brief technical demonstration of the steps needed to populate the platform and present information in a way that is useful for citizens. Participants recognised OpenSpending as a great tool for transparency and accountability, and asked to be more empowered in the future. 

Next, Mr. Cyprien TANKEU, AfroLeadership’s civic tech director, started his presentation with a demonstration of an example of visualisation of macroeconomic data with Google Public Data which is a platform for aggregating public information. He then moved on to develop a case on geographical data. He finally presented a use of OpenStreetMap. 

The d-portal is a platform that traces the flow of financial data from donors, using the International Aid Transparency Initiative (IATI) Standard. Charlie taught participants how to make queries into the d-portal and to retrieve information according to several indicators. It gives evidence to CSOs, donors and governments to inspect international aid spending against the reality in countries in various sectors (education, agriculture, health, etc.).

As in 2019, Cameroon was part of Open Data Day 2020. At the end of the workshop, participants asked to be part of the Open Budgets Network and expressed the desire to be trained extensively on how to use the OpenSpending platform and the d-portal for advocacy on transparency and accountability.

Designing a zine about environmental data in Mexico: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Sofia Shapiro from Técnicas Rudas in Mexico who received funding from Resource Watch to collectively explore Mexico’s mandated public data on construction projects and their environmental impacts.

To contextualise the state of environmental open data in Mexico, we can look at the case of the exploding mining industry. In 2011, the Wixárica people started a battle against the Canadian mining company First Majestic Silver Corp which decided to do an open-pit mining project in Cerro del Quemado, a sacred place for the Wixáricas. One of the big questions that existed at that time was where the exact location of the mining operation would be – information that by law should be public, but that the Ministry of Economy kept secret.

Public requests to the National Institute of Access to Information (INAI, formerly IFAI) to see the number of mining operation applications and their geographical locations were for many years “kidnapped” in the hands of the government, companies and some individuals. It took until 2014 for the Ministry of Economy to deliver the file with the geographical coordinates of the mining operations in an information request to INAI, making it available as “open data” (Sánchez, 2014).

However, to this day there are still many communities that are unaware that their land has already been granted to a company for ecologically devastating mining practices. Part of the reason for this is that even though the Ministry of Economy now supposedly makes information on mining concessions available via their online portal – CartoMinMex (the website currently seems to be down) – the most complete and recent information about mining is not available as open data. It is only possible to view it but not to download it. The second part is that the cost of developing the portal was $855,639.20 in contracts to ArcGIS and $3,045,817.80 in update, maintenance, and support subscriptions for the ArcGIS tools, a large sum spent on proprietary tools rather than free and open source ones. This example is illustrative of general state practice around environmental open data, including other types of projects from pipelines to water sanitation facilities. 

In light of the deep lack of accessibility and reliability of this data, as well as the importance of this information to the health of our land, water, and bodies, we invited members of the community to come together to discuss Mexico’s environmental open data resources and what we are lacking. 

Our March 7th 2020 event was held at El Foro Cultural Karuzo and was hosted by Técnicas Rudas, an organisation dedicated to the intersection of technology, feminism, and human rights, and publicised by Lado B, an independent journalism organisation.

More than a presentation, the event was a collaborative space to explore the data, ask questions and amplify the accessibility of these open data resources in our community. We considered what makes data about construction projects good or trustworthy, why we need it and what is lacking in the data provided by the government. And ultimately we asked how to make this kind of data truly accessible.

As an exercise in community knowledge and collaborative creation, we made a (rough first draft) zine about open data – specifically data related to the environment and resource extraction in Mexico. It has information on what materials companies are legally required to submit in order to get a pipeline or another ecologically sensitive project approved by the state, and where to find this information. It also has a bibliography of other independent tools for exploring this data, such as a tool made by Técnicas Rudas to host several open data projects. Amongst these is one project that specifically generates visualisations of preventive reports (ingresos preventivos) during the period of Felipe Calderón’s and Enrique Peña’s presidencies, using sources like the Ecological Gazette, the Environmental Impact portal of the Ministry of Environment and Natural Resources, as well as some official information for access requests.

This zine was a starting point for thinking about how to bring this process of data collection, analysis and questioning outside of Open Data Day and into a long-term community practice of proactive involvement and protection of our resources. Together we saw the need for independent projects – both for accountability and publicity – in order to translate this open data into an actionable resource for protecting our environment. Without a vigilant public demanding that this data be open and reliable, the state has shown it will simply avoid the task. And without the perspective of the community the databases will simply not contain important data points we need to inform ourselves properly. 

To summarise, some of the open databases and tools we shared with each other were the following:

Government databases:

  • General entry point for government open data on any topic
  • Current site for all infrastructure/construction applications being processed by the Secretariat of Environment and Natural Resources (Secretaría de Medio Ambiente y Recursos Naturales)
  • Ecological Gazette: list of all projects submitted and approved to the Ministry of the Environment
  • Digital library of the Secretariat of Environment and Natural Resources (Biblioteca digital de la SEMARNAT)

Other tools:

  • Preventative impact reports submitted during the Calderón-Peña period and visualisation of mining concessions in Mexico (current plots of land, pending applications for future mining projects, and mining reserve areas, amongst other data points)
  • A tool designed to detect connections between companies in Latin America, used often to detect irregularities and potential sites of corruption between various corporate and state actors

Economics Of Decentralized Storage / David Rosenthal

Almost two years ago, in "The Four Most Expensive Words in the English Language", I wrote skeptically about the economics of decentralized storage networks. I followed up two months later with "The Triumph Of Greed Over Arithmetic". Now, "Got a few spare terabytes of storage sitting around unused? Tardigrade can turn that into crypto-bucks" is Thomas Claburn's report on initial experience with Tardigrade, the "decentralized" storage network from Storj Labs. Below the fold, some more skepticism.

First, let's look at pricing. Claburn writes:
Last November, when pricing was announced, storing 1TB for a month cost $10, with egress bandwidth charged at $45 per TB. As a point of comparison, Amazon S3 charges $23 per TB for a month and $90 per TB, subject to significant variation. The Tardigrade network currently has about 19PB of capacity, 12PB of which are used, from about 6,500 platform participants.
But comparing decentralized storage to S3 is comparing apples and oranges. S3 is not simply a storage system. It is a part of a complex, integrated computation ecosystem, and is priced against the value that the ecosystem delivers.

The correct comparison is with storage-only services such as Backblaze and Wasabi. With that comparison, my skepticism looks justified. Wasabi charges $5.99/TB/month with $0/TB egress charge.  Backblaze charges $5/TB/month with $10/TB egress.

So, if you never access the data, Tardigrade is twice as expensive as the centralized competition. If you access 50% of the data each month, it costs $32.50/TB against Wasabi's $5.99, so more than 5 times as expensive. What exactly is the value Tardigrade adds to justify the extra cost to store data? Simply "decentralization"?
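The arithmetic behind that comparison can be checked with a minimal sketch. It uses only the per-TB prices quoted above; the function name and the simple model (storage price plus egress price times the fraction of data read back) are illustrative assumptions, and real bills would also depend on things this ignores, such as request charges or minimum retention periods.

```python
# Per-TB monthly prices quoted in the text, in USD: (storage, egress).
PRICES = {
    "Tardigrade": (10.00, 45.00),
    "Wasabi": (5.99, 0.00),
    "Backblaze": (5.00, 10.00),
}


def monthly_cost_per_tb(service, fraction_read):
    """Cost of storing 1 TB for a month and reading back `fraction_read` of it.

    Hypothetical helper for illustration: a deliberately simple model that
    only adds the storage price to the egress price scaled by the fraction read.
    """
    storage, egress = PRICES[service]
    return storage + fraction_read * egress


if __name__ == "__main__":
    for fraction in (0.0, 0.5):
        for service in PRICES:
            cost = monthly_cost_per_tb(service, fraction)
            print(f"{service:<10} reading {fraction:.0%} per month: ${cost:.2f}/TB")
```

With half the data read back each month, the model gives Tardigrade's $10 + 0.5 × $45 = $32.50 against Wasabi's flat $5.99.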

But, like all these cryptocurrency-based systems, Tardigrade's "decentralization" is more a marketing term than a practical reality. The money isn't decentralized, because customers pay Storj, who then pays a little of that money to the storage node operators (SNOs):
According to a company spokesperson, SNOs earn $1.50 per TB per month for static storage and $20 per TB for egress bandwidth, with the caveat that actual compensation varies based on network utilization, internet connection, and geographical location.
Storj keeps 56% of the bandwidth revenue ($25 of every $45). Their erasure coding scheme appears to have a 2.66 replication factor, so for every TB a customer pays $10 to store for a month, the network needs to pay SNOs at least $4 (2.66 × $1.50). It isn't clear whether the $1.50 is for a TB of capacity, or a TB of stored data:
  • If it is stored data, Storj keeps 60% of the storage revenue but also controls how much of your capacity actually earns for you.
  • If it is capacity, Storj only keeps 36% of the storage revenue.
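These percentages can be roughly reproduced with a short sketch. The prices, payout rates, replication factor and the 19PB/12PB figures come from the text above; the function name is hypothetical, and the capacity case rests on an assumption that is not spelled out in the text: that paying per TB of capacity means idle capacity is paid for too, so the payout per stored TB is scaled by total capacity over used capacity.

```python
# Figures quoted in the text (USD per TB unless noted).
CUSTOMER_STORAGE_PRICE = 10.00   # per TB per month
CUSTOMER_EGRESS_PRICE = 45.00    # per TB of egress
SNO_STORAGE_RATE = 1.50          # paid to node operators per TB per month
SNO_EGRESS_RATE = 20.00          # paid to node operators per TB of egress
REPLICATION_FACTOR = 2.66        # apparent erasure-coding expansion
CAPACITY_PB = 19                 # total network capacity
USED_PB = 12                     # capacity currently holding data


def storj_share(customer_price, sno_payout):
    """Fraction of the customer's payment that is not passed on to SNOs."""
    return (customer_price - sno_payout) / customer_price


# Bandwidth: $45 comes in, $20 goes out.
bandwidth_share = storj_share(CUSTOMER_EGRESS_PRICE, SNO_EGRESS_RATE)

# Case 1: the $1.50 is paid per TB of stored data (after erasure coding).
payout_stored = REPLICATION_FACTOR * SNO_STORAGE_RATE
stored_share = storj_share(CUSTOMER_STORAGE_PRICE, payout_stored)

# Case 2 (assumption): the $1.50 is paid per TB of capacity, used or not,
# so each stored TB also carries the cost of the idle capacity.
payout_capacity = payout_stored * CAPACITY_PB / USED_PB
capacity_share = storj_share(CUSTOMER_STORAGE_PRICE, payout_capacity)

if __name__ == "__main__":
    print(f"bandwidth share:           {bandwidth_share:.1%}")
    print(f"storage share (stored):    {stored_share:.1%}")
    print(f"storage share (capacity):  {capacity_share:.1%}")
```

Under those assumptions the sketch lands close to the 56% bandwidth figure and to the 60% and 36% storage figures for the two cases above.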
Either makes even Uber's advertised (but misleading) 25% look generous. Note that the $1.50 isn't even actual money:
For sustaining their nodes at 99.9 per cent availability, SNOs receive a token payment in STORJ tokens.
STORJ tokens are worth about $0.10 at present, though they must first be exchanged for another cryptocurrency like Bitcoin or Ethereum before they can be turned into a fiat currency like US dollars.
STORJ, with a "market cap" of $14.4M, is a rather small cryptocurrency, thus its value is highly volatile and easily manipulated. The SNOs bear these risks, and have to pay fees to trade into a larger cryptocurrency and more to trade into fiat. Even the company admits that, if you have to pay for the media, power, etc. the economics of running a node don't work:
In other words, Tardigrade isn't a storage-rental-to-riches scheme. Gleeson said the company does not recommend that people buy storage just to become a SNO. Rather, the network provides a potential way to defray the cost of existing storage hardware.
The company's ideal SNO is a data center that can't manage its procurement properly:
"There are a lot of partners with data centers that are underutilized and are seeing they can use that capacity to add to the profitability of their businesses," said Gleeson.
And, lastly, there are actually two kinds of nodes in the network, storage nodes and Satellites, which manage access to them:
At production launch, the Storj network will allow anyone to operate a Satellite in addition to the Tardigrade Satellites operated by Storj Labs. Being a Satellite operator is a big commitment with very stringent uptime requirements, much more so than a storage node operator. If a storage node goes offline temporarily, data can still be recovered from other nodes. If a Satellite goes offline temporarily, the data stored by that Satellite could become unavailable. Satellites are also responsible for managing payments to storage nodes storing data for the Satellite as well. 
Note that the Satellite through which your data was stored is a single point of failure.

Lastly, based on both theory and practice, one would expect that the vast majority of Satellites would be "operated by Storj Labs", so there would not really be "decentralization". This is, in fact, almost a legal requirement:
"If someone publicly shares illegal or copyrighted content and publicly shares it, if that is brought to our attention through a valid court order from law enforcement, we will comply with the laws and jurisdictions to remove that illegal content from the network that is connected to our Satellites."

"This hasn't happened yet, but we have a system in place to deal with these types of requests," he added. "However, we're decentralized and open source, so anyone can run a Satellite. We're working on a framework to help others who want to operate Satellites navigate this, as laws and regulations are very different across different regions and countries."
The history of ICOs shows the folly of assuming that "decentralization" is a magic shield protecting you from legal assault.

AI in the Library, round one / Andromeda Yelton

The San José State University School of Information wanted to have a half-course on artificial intelligence in their portfolio, and asked me to develop and teach it. (Thanks!) So I got a blank canvas on which to paint eight weeks of…whatever you might want graduate students in library & information science to know about AI.

For those of you who just want the reading list, here you go. For those of you who thought about the second-to-last sentence: ahahaha.

Image description: the “this is fine” dog meme, captioned “This is fine.”

This is of course the problem of all teachers — too much material, too little time — and in an iSchool it’s further complicated because, while many students have technological interests and expertise, few have programming skills and even fewer have mathematical backgrounds, so this course can’t be “intro to programming neural nets”. I can gesture in the direction of linear algebra and high-dimensional spaces, but I have to translate it all into human English first.

But further, even if I were to do that, it wouldn’t be the right course! As future librarians, very few of my students will be programming neural nets. They are much more likely to be helping students find sources for papers, or helping researchers find or manage data sets, or supporting professors who are developing classes, or helping patrons make sense of issues in the news, or evaluating vendor pitches about AI products. Which means I don’t need people who can write neural net code; I need people who understand the basics of how machine learning operates, who can do some critical analysis and situate it in its social context. People who know some things about what data is good for, how it’s hard, where to find it. People who know at least the general direction in which they might find news articles and papers and conferences that their patrons will care about. People who won’t be too dazzled by product hype and can ask pointed questions about how products really work, and whether they respect library values. And, while we’re at it, people who have some sense of what AI can do, not just theoretically, but concretely in real-world library settings.

Eight weeks: go!

What I ended up doing was four 2-week modules, with a rough alternation of theory and library case studies, and a pretty wild mix of readings: conference presentations, scholarly papers from a variety of disciplines, hilarious computational misadventures, news articles, data visualizations. I mostly kept a lid on the really technical stuff in the required readings, but tossed a lot of it into optional readings, so that students with that background or interest could pull on those threads. (And heavily annotated the optional readings, to give people a sense of what might interest them; I’d like to say this is why surprisingly many of my students did some optional reading, but actually they’re just awesome.) For case studies, we looked at the Northern Illinois University dime novels collection experiments; metadata enrichment in the Charles Teenie Harris archive; my own work with HAMLET; and the University of Rhode Island AI lab. This let us hit a gratifyingly wide variety of machine learning techniques, use cases (metadata, discovery, public services), and settings (libraries, archives).

Do I have a couple of pages of things to change up next time I teach the class (this fall)? Of course I do. But I think it went well for a first-time class (particularly for a first-time class in the middle of a global catastrophe…)

Big ups to the following:

  • Matthew Short of NIU and Bohyun Kim of URI, for guest speaking;
  • Everyone at SJSU who worked on their “how to teach online” materials, especially Debbie Faires — their onboarding did a good job of conveying SJSU-specific expectations and building a toolkit for teaching specifically online in a way that was useful to me as someone with a lot of offline teaching experience;
  • Zeynep Tufekci, Momin Malik, Catherine D’Ignazio, who suggested readings that I ended up assigning;
  • and my students, who are about to get a paragraph.

My students. Look. You signed up to take a class online — it’s an all-online program — but none of you signed up to do it while being furloughed, while homeschooling, while being sick with a scary new virus. And you knocked it out of the park. Week after week, asking for the smallest of extensions to hold it all together, breaking my heart in private messages, while publicly writing thoughtful, well-researched, footnoted discussion posts. While not only doing even the optional readings, but finding astonishment and joy in them. While piecing together the big ideas about data and bias and fairness and the genuine alienness of machine intelligence. I know for certain, not as an article of faith but as a statement of fact, that I will keep seeing your names out there, that your careers will go places, and I hope I am lucky enough to meet you in person someday.

Sun and Moon / Ed Summers

Since the Coronavirus stay-at-home order went into effect we’ve allowed Maeve to play outside in our yard with one friend, a neighbor named Gracyn. We’ve been careful not to allow Gracyn into our house, and sometimes Maeve will go over to Gracyn’s house to play in her yard. I guess this is breaking the rules, but they have only seen each other, and our families have stayed in touch through messaging to let each other know if anyone has felt even the least bit sick. Luckily so far we’ve all been healthy.

So Maeve and Gracyn have seen each other pretty much every day for the past month and a half, when the weather has been nice. There have even been some drizzly days when they played outside in the rain. As Kesa and I have been doing work they’ve played together, requiring very little attention from us, apart from the occasional snack or drink of water.

They spend most of the time riding their bikes around our little cul-de-sac, climbing a tree, and (most of all) playing pretend with their stuffed animals. They’ve been riding around on their bikes with the stuffed animals in a basket. I’ve seen the stuffed animals arranged in various configurations in the yard, and in a wagon, and perched in the tree.

Yesterday I found this piece of paper with Maeve’s handwriting on it. When I asked her about it she said that she and Gracyn wrote it together. I’ve included a picture below, but here is a transcription. At first I added some periods in to try to massage it into sentences, but that disturbed the flow of the words. So I left it how she wrote it, after adjusting some of her 8-year-old spelling.

In the dark to the light
the light that is very powerful
that can say hello to us. But darkness
can be rational–But it doesn’t
mean we should give up the stuffed-animals
are the best part of this world Stuffed animal land
the stuffed animals can do what they
want stuffed animal land yes we can
face things that are challenging but we
don’t give up we are one nation we
face things and we do sing but we
can do good to the ring sun and
the moon. But will never give up
and that is a law in this world
of sun and moon sun and moon
sun and moon sun and moon.

I feel very, very lucky that we have a small yard for them to play in, and that we have a little area for them to ride their bikes. Several people have said to me that it must be hard looking after children during this time, with all the changes to schooling, and the multiple Zoom calls going on in our small house. But honestly it feels like the children are looking after us.

It’s time to cut the CRAAP / Mita Williams

I do not have a good understanding of what academic librarians are currently teaching students with regard to evaluating information they find on the Internet. Rather than read the literature, I searched for the word CRAAP in my custom Google Search Engine for Ontario Academic Libraries. I found that many libraries – including my own place of work – advocate the use of the CRAAP checklist approach to evaluating information found online.

I have never been particularly enthusiastic about the CRAAP checklist approach to evaluating information and I know that I’m not the only librarian who feels this way. But until recently, if you had asked me what I would suggest as an alternative, I would have struggled to articulate what to replace it with.

As my last series of posts can attest, I have been recently creating creative-commons licensed learning objects with H5P through eCampus Ontario. I am doing so because in these unprecedented times much of the teaching on the university campus has transitioned to asynchronous online learning and as such, I believe that my teaching should transition as well.

This week, I made this short presentation introducing the reader to two methods that I think should replace the use of the CRAAP checklist.

This presentation introduces the reader to the COR (Civic Online Reasoning) Curriculum and the SIFT Method. Both are comprised of a short series of steps to help the reader separate fact from fiction on the Internet. Both methods are built from the strategies employed by professional fact-checkers.

Mike Caulfield, who created and advocates for the SIFT method, has explained why the CRAAP checklist is insufficient in these two interviews that are best read in full: “Getting Beyond the CRAAP Test: A Conversation with Mike Caulfield” and “Truth Is in the Network” from Project Information Literacy.

I also found his post, A Short History of CRAAP, particularly enlightening. My jaw dropped a bit at this particular connection:

So when the web came into being, library staff, tasked with teaching students web literacy, began to teach students how to use collection development criteria they had learned in library science programs. The first example of this I know of is Tate & Alexander’s 1996 paper which outlines a lesson plan using the “traditional evaluation criteria of accuracy, authority, objectivity, currency, and coverage.” ….

… So let’s keep that in mind as we consider what to do in the future: contrary to public belief we did teach students online information literacy. It’s just that we taught them methodologies that were developed to decide whether to purchase reference sets for libraries

A Short History of CRAAP

Perhaps this is the reason why librarians have such a hard time letting go of this particular approach.

NDSA Welcomes Three New Members! / Digital Library Federation

As of April 2020, the NDSA Leadership unanimously voted to welcome its three most recent applicants into the membership.

  • University of Dayton
  • Rhode Island School of Design (RISD)
  • University of California, Berkeley Library

This brings the total membership to over 250 members! Each of these new members brings a host of skills and experience to our group. Dayton works actively to support scholarship, teaching, and learning; RISD has over 18TB of digital materials and has an IMLS grant to implement a new DAMS; Berkeley, led by Coordinating Committee Member Salwa Ismail, has been solidifying its born-digital workflows and is developing a web-archiving program.

Each of these organizations has new additions to our various interest and working groups – so keep an eye out for them on your calls and be sure to give them a shout out. Please join me in welcoming our new members. To review our list of members, you can see them here.



LITA/ALA Survey of Library Response to COVID-19 / LITA

The Library and Information Technology Association (LITA) and its ALA partners are seeking a new round of feedback about the work of libraries as they respond to the COVID-19 crisis, releasing a survey and requesting feedback by 11:59 p.m. CDT, Monday, May 18, 2020. Please complete the survey by clicking on the following link:

LITA and its ALA partners know that libraries across the United States are taking unprecedented steps to answer the needs of their communities, and this survey will help build a better understanding of those efforts. LITA and its ALA partners will use the results to advocate on behalf of libraries at the national level, communicate aggregated results with the public and media, create content and professional development opportunities to address library staff needs, and share some raw, anonymized data elements with state-level staff and library support organizations for their own advocacy needs. 

Additional information about the survey: All library types are encouraged to respond. We are surveying at the library organizational level and are not collecting outlet/branch data. Financial-related data will only be used in aggregate and not shared in raw data format. Any raw data that is shared with states or other library support organizations outside of ALA will be anonymized. The survey should take about 15 minutes to complete, and all respondents who complete it will be automatically entered to win one of ten $30 gift certificates to the ALA Store.

Special thanks to the Colorado State Library’s Library Research Service and the Institute of Museum and Library Services, Public Library Data Alliance partners, Association of Research Libraries, and the ACRL Academic Library Trends & Statistics Survey Editorial Board. Additional information about the survey can be found at:

Islandora Online: Topics and Call for Proposals / Islandora

manez, Tue, 05/12/2020 - 18:11

Thank you very much to everyone who took part in our topics survey for Islandora Online. The topics will be:

Islandora 8 - General information and training

Metadata - Moving it, cleaning it up, and translating it into Islandora 8

Migrations - Moving into Islandora 8 from Islandora 7 and more

Institutional Repositories - Meeting IR needs with Islandora 8


Have something cool to share under these headings? Please send it in to our Call for Proposals: The CFP will be open until June 10.

Islandora Online will take the form of four online events, around four hours each, each focused on a specific topic of interest to the Islandora community. Each event will have a mix of presentations, panel discussion, and small group discussion, with some optional “social” components to break up the time. Regular sessions should be roughly 20 minutes (plus time for questions), panels run for around 40 minutes, and lightning talks are five minutes each.

Islandora Online will be held over July 2020.

Discussing agriculture, rainfall and climate data in Ghana: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report from the Department of Agriculture at the Asuogyaman District Assembly in Ghana who received funding from Resource Watch to host local farming organisations to create awareness on the need for data to be open and to show the effect of climate change on agriculture and related livelihoods using rainfall data.

Open Data Day 2020 was marked in Atimpoku, Ghana on Saturday 7th March 2020 at the Assembly Hall of the Asuogyaman District Assembly.

Organised by the Department of Agriculture in collaboration with Rite 90.1 FM and other stakeholders, there were four presentations under a thematic area of environmental data. The event to commemorate the day started at 9am with a presentation by David Dokli, the district director of agriculture, who threw more light on the significance of open data and the day as a whole. His presentation continued on monthly rainfall data of Asuogyaman District from two locations of Akosombo and Boso for 2018 and 2019, courtesy of the Ghana Meteorological Agency. The presentation elaborated on the impact of these rainfall data on agriculture and climate change and their relevance.

The second presentation was by Gertrude Mongkuma, the District Planning Officer for Asuogyaman, on the Emergency Preparedness Plan (EPP) of the Volta River Authority and other stakeholders in case there is a dam break.

Thirdly, Doris Owiiredu of the Fisheries Commission did a presentation on the impact of small scale fisheries on the environment and the type of data collected by her outfit. 

The final presentation was by Etornyo Agbeko PhD of the Water Research Institute of the Council for Scientific and Industrial Research, who led a discussion on climate change and the effects it will have on the water cycle. 

The event ended at 1pm with participants resolving to advocate for the need for data – especially on very important developmental issues – to be open. This event was sponsored by Open Knowledge Foundation through a mini-grant scheme. Open Data Day is celebrated throughout the world annually, and this is the tenth in the series.

Open data for accountability in Burkina Faso: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report by Abdoul Aziz Traore from Association SUUDU ANDAL in Burkina Faso, who received funding from the Foreign and Commonwealth Office to emphasise the importance of open data for development and accountability during their event.

Suudu Andal marks Open Data Day 2020 in Burkina Faso

On Saturday 7th March 2020, a fireside chat was held on open data with Malick Lingani and Abdoul Aziz Traore, thanks to the support of the Open Knowledge Foundation. The event was initiated under the leadership of Suudu Andal, an organisation focused on good governance and the Sustainable Development Goals. 

30 people including 20 girls, leaders of different organisations and structures were present at this event. We discussed how leaders can make data available to organisations to inform development and enable different stakeholders to be more effective in their interventions. Then we also stressed the importance of open data for accountability and development.


Finally, we exchanged views on how to make data available for transparency and good governance. It should be noted that this event had a special touch because it brought together young people who had little notion of open data.

After the event, many committed to promoting data provision for equitable and inclusive development.  The fireside chat consisted of a total of 80 questions and lasted two and a half hours. One key question was: what role can youth play to make open contracting a reality in Burkina Faso given its importance for development?

What's New / Islandora

manez, Tue, 05/12/2020 - 14:05

Our website has been overhauled in a big way. We have moved to Drupal 8, changed our look, and shifted content around to make it easier to find the Islandora information and resources that you need. Can't find something you expect from the old site? Let us know and we'll get it fixed.




Editorial / Code4Lib Journal

An abundance of information sharing.

Leveraging Google Drive for Digital Library Object Storage / Code4Lib Journal

This article will describe a process at the University of Kentucky Libraries for utilizing an unlimited Google Drive for Education account for digital library object storage. For a number of recent digital library projects, we have used Google Drive for both archival file storage and web derivative file storage. As a part of the process, a Google Drive API script is deployed in order to automate the gathering of Google Drive object identifiers. Also, a custom Omeka plugin was developed to allow for referencing web deliverable files within a web publishing platform via object linking and embedding. For a number of new digital library projects, we have moved toward a small VM approach to digital library management where the VM serves as a web front end but not a storage node. This has necessitated alternative approaches to storing web addressable digital library objects. One option is the use of Google Drive for storing digital objects. An overview of our approach is included in this article as well as links to open source code we adopted and more open source code we produced.

Building a Library Search Infrastructure with Elasticsearch / Code4Lib Journal

This article discusses our implementation of an Elastic cluster to address our search, search administration and indexing needs, how it integrates in our technology infrastructure, and finally takes a close look at the way that we built a reusable, dynamic search engine that powers our digital repository search. We cover the lessons learned with our early implementations and how to address them to lay the groundwork for a scalable, networked search environment that can also be applied to alternative search engines such as Solr.
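To make the idea of a reusable, dynamic search layer a little more concrete, here is a minimal sketch in Python. It is not the article's code: the function name `build_search_body` and the field names are hypothetical, and it only assembles a standard Elasticsearch query DSL body (a `bool` query with a `multi_match` clause and `term` filters) without contacting a cluster.

```python
from typing import Any, Dict, List, Optional


def build_search_body(
    text: str,
    fields: List[str],
    filters: Optional[Dict[str, str]] = None,
    size: int = 10,
) -> Dict[str, Any]:
    """Build an Elasticsearch query DSL body from simple search parameters."""
    # Full-text part: match the user's text across several fields at once.
    must = [{"multi_match": {"query": text, "fields": fields}}]
    # Exact-match facets go in "filter" so they do not affect relevance scoring.
    filter_clauses = [
        {"term": {name: value}} for name, value in (filters or {}).items()
    ]
    return {
        "size": size,
        "query": {"bool": {"must": must, "filter": filter_clauses}},
    }


# Hypothetical field names for a digital repository index.
body = build_search_body(
    "civil war letters",
    fields=["title^2", "description"],
    filters={"collection": "manuscripts"},
)
```

Keeping query construction in one small function like this is one way to make the search layer reusable: the calling code passes plain parameters, and only this function would need to change if the backend were swapped for another engine such as Solr.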

How to Use an API Management platform to Easily Build Local Web Apps / Code4Lib Journal

Setting up an API management platform like DreamFactory can open up a lot of possibilities for potential projects within your library. With an automatically generated RESTful API, the University Libraries at Virginia Tech have been able to create applications for gathering walk-in data and reference questions, public polling apps, feedback systems for service points, data dashboards and more. This article will describe what an API management platform is, why you might want one, and the types of potential projects that can quickly be put together by your local web developer.

Git and GitLab in Library Website Change Management Workflows / Code4Lib Journal

Library websites can benefit from a separate development environment and a robust change management workflow, especially when there are multiple authors. This article details how the Oakland University William Beaumont School of Medicine Library uses Git and GitLab in a change management workflow with a serverless development environment for their website development team. Git tracks changes to the code, allowing changes to be made and tested in a separate branch before being merged back into the website. GitLab adds features such as issue tracking and discussion threads to Git to facilitate communication and planning. Adoption of these tools and this workflow has dramatically improved the organization and efficiency of the OUWB Medical Library web development team, and it is the hope of the authors that by sharing our experience with them others may benefit as well.

Experimenting with a Machine Generated Annotations Pipeline / Code4Lib Journal

The UCLA Library reorganized its software developers into focused subteams with one, the Labs Team, dedicated to conducting experiments. In this article we describe our first attempt at conducting a software development experiment, in which we attempted to improve our digital library’s search results with metadata from cloud-based image tagging services. We explore the findings and discuss the lessons learned from our first attempt at running an experiment.

Leveraging the RBMS/BSC Latin Place Names File with Python / Code4Lib Journal

To answer the relatively straightforward question “Which rare materials in my library catalog were published in Venice?” requires an advanced knowledge of geography, language, orthography, alphabet graphical changes, cataloging standards, transcription practices, and data analysis. The imprint statements of rare materials transcribe place names faithfully as they appear on the piece itself, such as Venetus or Venetiae, rather than in a recognizable and contemporary form, such as Venice, Italy. Rare materials catalogers recognize this geographic discoverability and selection issue and solve it with a standardized solution. To add consistency and normalization to imprint locations, rare materials catalogers utilize hierarchical place names to create a special imprint index. However, this normalized and contemporary form of place name is often missing from legacy bibliographic records. This article demonstrates using a traditional rare materials cataloging aid, the RBMS/BSC Latin Place Names File, with programming tools, Jupyter Notebook and Python, to retrospectively populate a special imprint index for 17th-century rare materials. This methodology enriched 1,487 MAchine Readable Cataloging (MARC) bibliographic records with hierarchical place names (MARC 752 fields) as part of a small pilot project. This article details a partially automated solution to this geographic discoverability and selection issue; however, a human component is still ultimately required to fully optimize the bibliographic data.
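As a rough illustration of the lookup step described above, the following Python sketch maps a transcribed Latin place name to a MARC 752 hierarchical place name. It is not the article's notebook: the two-entry `LATIN_PLACES` table is a stand-in for the much larger RBMS/BSC file, the function names are hypothetical, and the `##` for the two undefined indicators is only a display convention.

```python
import string
from typing import Dict, Optional

# Tiny illustrative lookup; the real RBMS/BSC file is far larger.
# Keys are the Latin forms mentioned in the abstract; values are MARC 752
# subfields ($a country or larger entity, $d city).
LATIN_PLACES: Dict[str, Dict[str, str]] = {
    "venetus": {"a": "Italy", "d": "Venice"},
    "venetiae": {"a": "Italy", "d": "Venice"},
}


def normalize(imprint_place: str) -> str:
    """Lowercase and strip surrounding punctuation, brackets, and whitespace."""
    return imprint_place.strip(string.punctuation + string.whitespace).lower()


def make_752(imprint_place: str) -> Optional[str]:
    """Return a 752 field as display text, or None if the form is unknown."""
    subfields = LATIN_PLACES.get(normalize(imprint_place))
    if subfields is None:
        return None  # unmatched forms are left for a cataloger to review
    body = " ".join(f"${code} {value}" for code, value in sorted(subfields.items()))
    return f"752 ## {body}"


print(make_752("[Venetiae] :"))  # matched form
print(make_752("Lugduni"))       # not in the sample table, so None
```

Returning `None` for unmatched forms rather than guessing mirrors the abstract's point that the process is only partially automated and still needs human review.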

Tweeting Tennessee’s Collections: A Case Study of a Digital Collections Twitterbot Implementation / Code4Lib Journal

This article demonstrates how a Twitterbot can be used as an inclusive outreach initiative that breaks down the barriers between the web and the reading room to share materials with the public. These resources include postcards, music manuscripts, photographs, cartoons and any other digitized materials. Once in place, Twitterbots allow physical materials to converge with the technical and social space of the Web. Twitterbots are ideal for busy professionals because they allow librarians to make meaningful impressions on users without requiring a large time investment. This article covers the recent implementation of a digital collections bot (@UTKDigCollBot) at the University of Tennessee, Knoxville (UTK), and provides documentation and advice on how you might develop a bot to highlight materials at your own institution.

Building Strong User Experiences in LibGuides with Bootstrapr and Reviewr / Code4Lib Journal

With nearly fifty subject librarians creating LibGuides, the LibGuides Management Team at Notre Dame needed a way to both empower guide authors to take advantage of the powerful functionality afforded by the Bootstrap framework native to LibGuides, and to ensure new and extant library guides conformed to brand/identity standards and the best practices of user experience (UX) design. To accomplish this, we developed an online handbook to teach processes and enforce styles; a web app to create Twitter Bootstrap components for use in guides (Bootstrapr); and a web app to radically speed the review and remediation of guides, as well as better communicate our changes to guide authors (Reviewr). This article describes our use of these three applications to balance empowering guide authors against usefully constraining them to organizational standards for user experience. We offer all of these tools as FOSS under an MIT license so that others may freely adapt them for use in their own organization.

IIIF by the Numbers / Code4Lib Journal

The UCLA Library began work on building a suite of services to support IIIF for their digital collections. The services perform image transformations and delivery as well as manifest generation and delivery. The team was unsure about whether they should use local or cloud-based infrastructure for these services, so they conducted some experiments on multiple infrastructure configurations and tested them in scenarios with varying dimensions.

Trust, But Verify: Auditing Vendor-Supplied Accessibility Claims / Code4Lib Journal

Despite a long-overdue push to improve the accessibility of our libraries’ online presences, much of what we offer to our patrons comes from third party vendors: discovery layers, OPACs, subscription databases, and so on. We can’t directly affect the accessibility of the content on these platforms, but rely on vendors to design and test their systems and report on their accessibility through Voluntary Product Accessibility Templates (VPATs). But VPATs are self-reported. What if we want to verify our vendors’ claims? We can’t thoroughly test the accessibility of hundreds of vendor systems, can we? In this paper, we propose a simple methodology for spot-checking VPATs. Since most websites struggle with the same accessibility issues, spot checking particular success criteria in a library vendor VPAT can tip us off to whether the VPAT as a whole can be trusted. Our methodology combines automated and manual checking, and can be done without any expensive software or complex training. What’s more, we are creating a repository to share VPAT audit results with others, so that we needn’t all audit the VPATs of all our systems.
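For a sense of what the automated half of such a spot check might look like, here is a small Python sketch using only the standard library. The abstract does not say which success criteria the authors check, so the choice of missing image alt attributes (relevant to WCAG 1.1.1, Non-text Content) is an assumption, as are the class and function names.

```python
from html.parser import HTMLParser
from typing import List


class MissingAltFinder(HTMLParser):
    """Collect <img> tags that have no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        self.missing: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            # An empty alt="" is acceptable for decorative images,
            # so only a completely absent attribute is flagged.
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src") or "(no src)")


def images_missing_alt(html: str) -> List[str]:
    """Return the src of every image in the markup that lacks an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


sample = '<p><img src="logo.png" alt="Library logo"><img src="chart.png"></p>'
print(images_missing_alt(sample))
```

A check like this can only show that a claim is wrong, for example a VPAT that reports full support while images lack alt attributes. Whether the alt text that is present is actually meaningful still needs the manual half of the audit.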

A public contracting data camp in Bolivia: Open Data Day 2020 report / Open Knowledge Foundation

On Saturday 7th March 2020, the tenth Open Data Day took place with people around the world organising over 300 events to celebrate, promote and spread the use of open data. Thanks to generous support from key funders, the Open Knowledge Foundation was able to support the running of more than 60 of these events via our mini-grants scheme.

This blogpost is a report from CONSTRUIR Foundation in Bolivia who received funding from Hivos to organise a data camp to advocate for more and better public contracting data.

Bolivia is one of the two South American countries (the other is Venezuela) that have not debated, approved and promulgated an Access to Public Information Law. The right of access to public information allows other states to promote a culture of open information and mitigates the culture of secrecy; it also provides better guarantees for the exercise of citizen rights.

We feel that Bolivia’s public procurement system is a mechanism that the state should use to make the use of public resources transparent and to get closer to citizens, by establishing instruments of access to information that make public management more efficient and public officers more responsible towards citizens. Data on the resources destined for public purchases and procurement should be published in open data formats that are accessible to different types of audiences.

However, the Bolivian public procurement portal does not release standardised data. The information generated on public procurement is concentrated in the procurement system SICOES, and the data incorporated in the website is published in formats that do not allow it to be analysed immediately.

According to SICOES, 32,248 public procurement tenders for works, services and consulting were published on the website in 2018, of which 4,144 were contracted. In addition, the Ministry of Economy and Public Finance states that 26% of the total spending in the 2018 General State Budget was on services adding up to a total of 12 million bolivianos, equivalent to $1,753,920 USD.

For Open Data Day, CONSTRUIR Foundation promoted our Data Camp as an event to open public contracting at the municipality level. The first part of the event included presentations about opening public procurement data and Wikipedia projects to provide a common source of open data that can be used and stored in Wikidata.

In the next part of the event, the participants followed the Data Camp methodology, forming collaborative groups to open and analyse data on public contracting purchases from different municipalities. Once in groups, the 47 participants used the Open Contracting Data Standard to structure available public procurement data from six municipalities in the Department of La Paz in Bolivia.
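To give a sense of what structuring procurement data with the Open Contracting Data Standard can involve, here is a minimal Python sketch that turns one flat record into a tender release. It is an illustration only, not the Data Camp's tooling: the column names, the sample row, and the `ocds-xxxxxx` prefix are invented placeholders, and a real release would carry more fields.

```python
from typing import Any, Dict


def to_ocds_release(row: Dict[str, str], ocid_prefix: str) -> Dict[str, Any]:
    """Map one flat procurement row to a minimal OCDS tender release."""
    # The ocid identifies the whole contracting process across releases.
    ocid = f"{ocid_prefix}-{row['process_id']}"
    return {
        "ocid": ocid,
        "id": f"{ocid}-tender",
        "date": row["published"],
        "tag": ["tender"],
        "initiationType": "tender",
        "buyer": {"name": row["municipality"]},
        "tender": {
            "id": row["process_id"],
            "title": row["title"],
            "value": {"amount": float(row["amount_bob"]), "currency": "BOB"},
        },
    }


# Made-up sample row with hypothetical column names.
row = {
    "process_id": "0001",
    "published": "2018-05-02T00:00:00Z",
    "municipality": "Example Municipality",
    "title": "Road maintenance services",
    "amount_bob": "15000",
}
release = to_ocds_release(row, "ocds-xxxxxx")
```

Once every municipality's records share this shape, tenders from different sources can be compared and aggregated without source-specific cleanup.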

As a result of the activities, the participating youth organisations exchanged experiences and strengthened knowledge about open data, the right of access to public information, public procurement and generating tools for opening public contracts for the social control of public spending. 

Finally, the databases resulting from the opening process of public procurement at the municipal level will be part of the Municipal Public Procurement Observatory of Bolivia promoted by the CONSTRUIR Foundation, which will be available to consult in May on the website: