September 21, 2014

Peter Sefton

Digital Object Pattern (DOP) vs chucking files in a database, approaches to repository design

At work, in the eResearch team at the University of Western Sydney, we’ve been discussing various software options for working-data repositories for research data, and holding a series of ‘tool days’: informal hack-days where we try out various software packages. For the last few months we’ve been looking at “working-data repository” software for researchers in a principled way, searching for one or more perfect Digital Object Repositories for Academe (DORAs).

One of the things I’ve been ranting on about is the flexibility of the “Digital Object Pattern” (we always need more acronyms, so let’s call it DOP) for repository design, as implemented by the likes of ePrints, DSpace, Omeka, CKAN and many of the Fedora Commons based repository solutions. At its most basic, this means a repository that is built around a core set of objects (which might be called something like an Object, an ePrint, an Item, or a Data Set depending on which repository you’re talking to). These Digital Objects have:

Basic DOP Pattern

There are infinite ways to model a domain but this is a tried-and-tested pattern which is worth exploring for any repository, if only because it’s such a common abstraction that lots of protocols and user interface conventions have grown up around it.

I found this discussion of the Digital Object model used in CNRI’s Digital Object Repository Server (DORS), obviously a cousin of DORA.

This data structure allows an object to have the following:

  • a set of key-value attributes that describe the object, one of which is the object’s identifier

  • a set of named ‘data elements’ that hold potentially large byte sequences (analogous to one or more data files)

  • a set of key-value attributes for each of the data elements

This relatively simple data structure allows for the simple case, but is sufficiently flexible and extensible to incorporate a wide variety of possible structures, such as an object with extensive metadata, or a single object which is available in a number of different formats. This object structure is general enough that existing services can easily map their information-access paradigm onto the structure, thus enhancing interoperability by providing a common interface across multiple and diverse information and storage systems. An example application of the DO data model is illustrated in Figure 1.
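To make that data structure concrete, here is a minimal sketch in Python (my own illustration, not CNRI’s code; the class and field names are invented): an object with key-value attributes, one of which is the identifier, plus named data elements that each carry their own attributes.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class DataElement:
    """A named byte sequence with its own key-value attributes."""
    content: bytes
    attributes: Dict[str, str] = field(default_factory=dict)

@dataclass
class DigitalObject:
    """Core DOP structure: object-level attributes plus named data elements."""
    attributes: Dict[str, str] = field(default_factory=dict)
    elements: Dict[str, DataElement] = field(default_factory=dict)

    @property
    def identifier(self) -> str:
        # the identifier is just one of the object's attributes
        return self.attributes["id"]

# One object available in two formats, each with element-level metadata
obj = DigitalObject(attributes={"id": "hdl:1234/56", "title": "Sensor run 42"})
obj.elements["data.toa5"] = DataElement(b"...", {"format": "TOA5"})
obj.elements["data.nc"] = DataElement(b"...", {"format": "NetCDF"})
```

Note how the "single object in multiple formats" case from the quote above falls out naturally: two data elements, one set of shared object-level metadata.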

To the above list of features and advantages I’d add a couple of points on how to implement the ideal Digital Object repository:

(I’m trying to keep this post reasonably short, but quickly: another really good repository pattern, one that complements DOP, is to keep the concern of Storing Stuff separate from Indexing Stuff for Search and Browse. That is, the Digital Objects should be stashed somewhere with all their metadata and data, and no matter what metadata type you’re using, you build one or more discovery indexes from that. This is worth mentioning because as soon as some people see RDF they immediately think Triple Store. OK, but for repository design I think it’s more helpful to think Triple Index: treat the RDF reasoner, SPARQL query endpoint etc. as a separate concern from repositing.)
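A toy sketch of that separation of concerns, in Python (my own, not from any of the repositories mentioned): the store is authoritative; the index is disposable and can be rebuilt from the store at any time.

```python
from collections import defaultdict

# Naive in-memory stand-ins for the two concerns
store = {}                # identifier -> full object (metadata and all)
index = defaultdict(set)  # search term -> set of identifiers

def reposit(obj):
    """Storage concern: stash the whole object."""
    store[obj["id"]] = obj

def reindex():
    """Discovery concern: throw the index away and rebuild it
    from the stored objects, whatever their metadata schema."""
    index.clear()
    for oid, obj in store.items():
        for value in obj.get("metadata", {}).values():
            for term in str(value).lower().split():
                index[term].add(oid)

reposit({"id": "obj1", "metadata": {"title": "Rainfall time series"}})
reindex()
```

The point of the design is visible in `reindex()`: you can change indexing technology (Solr, a triple index, whatever) without ever touching the stored objects.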

The DOP contrasts with a file-centric pattern where every file is modelled separately, with its own metadata; this is the approach taken by HIEv, the environmental science Working Data Repository we looked at last week. Theoretically this gives you infinite flexibility, but in practice it makes it harder to build a usable data repository.

Files as primary repository objects

Once your repository has a lot of stuff in it, like image thumbnails, derived files such as OCRed text, and transcoded versions of files (say from the proprietary TOA5 format into NetCDF), then you’re faced with the challenge of indexing them all for search and browse in a way that makes them appear to clump together. I think that as HIEv matures, and more and more relationships between files become important, we’ll probably want to add container objects that automatically bundle together all the related bits and pieces to do with a single ‘thing’ in the repository. For example, a time series data set may have the original proprietary file format, some rich metadata, a derived file in a standard format, a simple plot to preview the file contents, and a re-sampled data set at lower resolution, all of which have more or less the same metadata about where they came from and when, and some shared utility. So, we’ll probably end up with something like this:

Adding an abstraction to group files into Objects (once the UI gets unmanageable)

Draw a box around that and what have you got?

The Digital Object Pattern, that’s what, albeit probably implemented in a somewhat fragile way.
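Here is roughly what that “draw a box around it” step looks like in code (a hypothetical sketch; field names like `source` and `role` are invented): per-file records that share provenance get grouped into one container object.

```python
from collections import defaultdict

# Per-file records, each carrying its own (largely duplicated) metadata,
# as in a file-centric repository like HIEv
files = [
    {"name": "run7.toa5", "source": "station-3/run-7", "role": "original"},
    {"name": "run7.nc",   "source": "station-3/run-7", "role": "derived"},
    {"name": "run7.png",  "source": "station-3/run-7", "role": "preview"},
    {"name": "run8.toa5", "source": "station-3/run-8", "role": "original"},
]

def to_objects(file_records):
    """Group files that share provenance into one container object,
    hoisting the shared metadata up to the object level."""
    groups = defaultdict(list)
    for f in file_records:
        groups[f["source"]].append(f)
    return [
        {"id": source,
         "metadata": {"source": source},
         "elements": {f["name"]: f for f in members}}
        for source, members in groups.items()
    ]

objects = to_objects(files)  # one object per 'thing', not per file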

With the DOP, as with any repository implementation pattern, you have to make some design decisions. Gerry Devine asked at our tools day this week: what do you do about data-items that are referenced by multiple objects?

First of all, it is possible for one object to reference another, or data elements in another, but if there’s a lot of this going on then maybe the commonly re-used data elements could be put in their own object. A good example of this is the way WordPress (which is probably where you’re reading this) works. All images are uploaded into a media collection, and then referenced by posts and pages: an image doesn’t ‘belong’ to a document except by association, if the document calls it in. This is a common approach for content management systems, allowing for re-use of assets across objects. But if you were building a museum collection project with a Digital Object for each physical artefact, it might be better for practical reasons to store images of the artefact as data elements on the object, and to store other images, which might be used for context etc., separately as image objects.
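In data-structure terms the WordPress approach is just indirection (a made-up miniature, not WordPress’s actual schema): documents hold references into a shared media collection instead of containing the bytes themselves.

```python
# Shared assets live as their own objects in a media collection
media = {"img-1": {"path": "uploads/map.png"}}

# Documents reference assets by identifier; two posts, one image
posts = [
    {"id": "post-1", "body": "See the map.",     "asset_refs": ["img-1"]},
    {"id": "post-2", "body": "Same map again.",  "asset_refs": ["img-1"]},
]

def resolve(post):
    """Dereference a document's shared assets at render time."""
    return [media[ref] for ref in post["asset_refs"]]
```

Deleting `post-1` leaves the image intact for `post-2`; the trade-off is that every display of a document now involves a lookup, and orphaned assets need their own cleanup policy.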

Of course if you’re a really hardcore developer you’ll probably want to implement the most flexible possible pattern and put one file per object, with a ‘master object’ to tie them together. This makes development of a usable repository significantly harder. BTW, you can do it using the DOP with one-file per Digital Object, and lots of relationships. Just be prepared for orders of magnitude more work to build a robust, usable system.

Creative Commons License
Digital Object Pattern (DOP) vs chucking files in a database, approaches to repository design is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

by ptsefton at September 21, 2014 11:09 PM


Code4Lib North (Ottawa): Tuesday October 7th, 2014


  • Mark Baker - Principal Architect at Zepheira will provide a brief overview of some of Zepheira’s BibFrame tools in development.
  • Jennifer Whitney - Systems Librarian at MacOdrum Library will present OpenRefine (formerly Google Refine) – a neat and powerful tool for cleaning up messy data.
  • Sarah Simpkin, GIS and Geography Librarian & Catherine McGoveran, Government Information Librarian (both from UOttawa Library) - will team up to present on a recent UOttawa sponsored Open Data Hackfest as well as to introduce you to Open Data Ottawa.

Date: Tuesday October 7th, 2014, 7:00PM (19h00)

Location: MacOdrum Library, Carleton University, 1125 Colonel By Drive, Ottawa, ON

RSVP: You can RSVP on the code4lib Ottawa Meetup page

by cbeer at September 21, 2014 06:38 PM

David Rosenthal

Utah State Archives has a problem

A recent discussion on the NDSA mailing list concerned the Utah State Archives' struggle with the costs of being forced to use Utah's state IT infrastructure for preservation. Below the fold, some quick comments.

Here's a summary of the situation the Archives finds itself in:

we actually have two separate copies of the AIP. One is on m-disc and the other is on spinning disk (a relatively inexpensive NAS device connected to our server, for which we pay our IT department each month). ... We have centralized IT, where there is one big data center and servers are virtualized. Our IT charges us a monthly rate for not just storage, but also all of their overhead to exist as a department. ... and we are required by statute to cooperate with IT in this model, so we can't just go out and buy/install whatever we want. For an archives, that's a problem, because our biggest need is storage but we are funded based upon the number of people we employ, not the quantity of data we need to store, and convincing the Legislature that we need $250,000/year for just one copy of 50 TB of data is a hard sell, never mind additional copies for SIP, AIP, and/or DIP.
Michelle Kimpton, who is in the business of persuading people that using DuraCloud is cheaper and better than doing it yourself, leaped at the opportunity this offered (my emphasis):
If I look at Utah State Archive storage cost, at $5,000 per year per TB vs. Amazon S3 at $370/year/TB it is such a big gap I have a hard time believing that Central IT organizations will be sustainable in the long run. Not that Amazon is the answer to everything, but they have certainly put a stake in the ground regarding what spinning disk costs, fully loaded (meaning this includes utilities, building and personnel). Amazon S3 also provides 3 copies, 2 onsite and one in another data center.

I am not advocating by any means that S3 is the answer to it all, but it is quite telling to compare the fully loaded TB cost from an internal IT shop vs. the fully loaded TB cost from Amazon.

I appreciate you sharing the numbers Elizabeth and it is great your IT group has calculated what I am guessing is the true cost for managing data locally.
Elizabeth Perkes for the Archives responded:
I think using Amazon costs more than just their fees, because someone locally still has to manage any server space you use in the cloud and make sure the infrastructure is updated. So then you either need to train your archives staff how to be a system administrator, or pay someone in the IT community an hourly rate to do that job. Depending on who you get, hourly rates can cost between $75-150/hour, and server administration is generally needed at least an hour per week, so the annual cost of that service is an additional $3,900-$7,800. Utah's IT rate is based on all costs to operate for all services, as I understand it. We have been using a special billing rate for our NAS device, which reflects more of the actual storage costs than the overhead, but then the auditors look at that and ask why that rate isn't available to everyone, so now IT is tempted to scale that back. I just looked at the standard published FY15 rates, and they have dropped from what they were a couple of years ago. The official storage rate is now $0.2386/GB/month, which is $143,160/year for 50 TB, or $2,863.20 per TB/year.
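As a sanity check on the arithmetic in the quoted rates (my calculation, using decimal units, 1 TB = 1000 GB):

```python
def per_tb_year(rate_gb_month):
    """Convert a $/GB/month storage rate to $/TB/year."""
    return rate_gb_month * 1000 * 12

utah_it = per_tb_year(0.2386)  # published FY15 rate -> ~$2,863.20/TB/year
s3_quoted = 370                # Kimpton's S3 figure, $/TB/year
# 50 TB at the Utah rate is ~$143,160/year, roughly 7.7x the quoted S3 cost
```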
But this doesn't get at the fundamental flaws in Michelle's marketing:
As I've pointed out before, Amazon's margins on S3 are enviable. You don't need to be very big to have economies of scale sufficient to undercut S3, as the numbers from Backblaze demonstrate. The Archive's 50TB, though, is possibly not enough to achieve this, even if they were actually managing the data locally.

But the Archive might well employ a strategy similar to that I suggested for the Library of Congress Twitter collection. They already keep a copy on m-disk. Suppose they kept two copies on m-disk as the Library keeps two copies on tape, and regarded that as their preservation solution. Then they could use Amazon's Reduced Redundancy Storage and AWS virtual servers as their access solution. Running frequent integrity checks might take an additional small AWS instance, and any damage detected could be repaired from one of the m-disk copies.
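The integrity-check-and-repair loop in that strategy could be as simple as this sketch (entirely hypothetical; the fetch and repair callables stand in for whatever AWS and m-disk plumbing would actually be used):

```python
import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Manifest of known-good checksums, taken from the m-disk masters
manifest = {"aip-001/data.nc": sha256(b"archived bytes")}

def audit(fetch_cloud, fetch_mdisk, repair):
    """Compare each cloud copy against the manifest; on mismatch,
    re-upload the good copy from m-disk via the supplied repair()."""
    for path, expected in manifest.items():
        if sha256(fetch_cloud(path)) != expected:
            repair(path, fetch_mdisk(path))
```

Run frequently from a small AWS instance, this treats the cloud copy as purely an access copy: any damage Reduced Redundancy Storage lets through is caught by the checksum and healed from the offline masters.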

Using the cloud for preservation is almost always a bad idea. Preservation is a base-load activity whereas the cloud is priced as a peak-load product. But the spiky nature of current access to archival collections is ideal for the cloud.

by David. at September 21, 2014 04:55 AM

September 20, 2014

John Miedema

“Book Was There” by Andrew Piper. If we’re going to have ebooks that distract us, we might as well have ones that help us analyse too.

“I can imagine a world without books. I cannot imagine a world without reading” (Piper, ix). In these last few generations of print there is nothing keeping book lovers from reading print books. Yet with each decade the print book yields further to the digital. But there it is, we are the first few generations of digital, and we are still discovering what that means for reading. It is important to document this transition. In Book Was There: Reading in Electronic Times, Piper describes how the print book is shaping the digital screen and what it means for reading.

Book was there. It is a quote from Gertrude Stein, who understood that it matters deeply where one reads. Piper: “my daughter … will know where she is when she reads, but so too will someone else.” (128) It is a warm promise, and an observation that could be ominous; its possibilities are still being explored.

The differences between print and digital are complex, and Piper is not making a case for or against books. The book is a physical container of letters. The print book is “at hand,” a continuous presence, available for daily reference and so capable of reinforcing new ideas. The word, “digital,” comes from “digits” (at least in English), the fingers of the hand. Digital technology is ambient, but could allow for more voices, more debate. On the other hand, “For some readers the [print] book is anything but graspable. It embodies … letting go, losing control, handing over.” (12) And internet users are known to flock together, reinforcing what they already believe, ignoring dissent. Take another example. Some criticize the instability of the digital. Turn off the power and the text is gone. Piper counters that digital text is incredibly hard to delete, with immolation of the hard drive being the NSA recommended practice.

Other differences are still debated. There is a basic two-dimensional nature to the book, with pages facing one another and turned. One wonders if this duality affords reflection. Does the return to one-dimensional scrolling of the web page numb the mind? Writing used to be the independent act of one or two writers. Reading was a separate event. Digital works like Wikipedia are written by many contributors, organized into sections. Piper wonders if it is possible to have collaborative writing that is also tightly woven like literature. (There is the recent example of 10 PRINT, written by ten authors in one voice.) Books have always been shared, a verb that has its origins in “shearing … an act of forking.” (88) With digital, books can be shared more easily, and readers can publish endings of their own. Books are forked into different versions. Piper cautions that over-sharing can lead to the kind of forking that ended the development of Unix. But we now have the successful Unix. Is there a downside?

Scrolling aside, digital is really a multidimensional media. Text has been rebuilt from the ground up, with numbers first. New deep kinds of reading are becoming possible. Twenty-five years ago a professor of mine lamented that he could not read all the academic literature in his discipline. Today he can. Piper introduces what is being called “distant reading”: the use of big data technologies, natural language processing, and visualization, to analyze the history of literature at the granular level of words. In his research, he calculates how language influences the writing of a book, and how in turn the book changes the language of its time. It measures a book in a way that was never possible with disciplined close reading or speed reading. “If we’re going to have ebooks that distract us, we might as well have ones that help us analyse too.” (148)

Piper embraces the fact that we now have new kinds of reading. He asserts that these practices need not replace the old. Certainly there will always be print books for those of us who love a good slow read. I do think, however, that trade-offs are being made. Books born digital are measurably shorter than print, more suited to quick reading and analysis by numbers. New authors are writing to digital readers. Readers and reading are being shaped in turn. The reading landscape is changing. These days I am doubtful that traditional reading of print books — or even ebooks — will remain a common practice. There it is.

by johnmiedema at September 20, 2014 06:56 PM

September 19, 2014

District Dispatch

“Outside the Lines” at ICMA

(From left) David Singleton, Director of Libraries for the Charlotte Mecklenburg Library, with Public Library Association (PLA) Past President Carolyn Anthony, PLA Director Barb Macikas and PLA President Larry Neal after a tour of ImaginOn.

This week, many libraries are inviting their communities to reconnect as part of a national effort called Outside the Lines (September 14-20). My personal experience of new acquaintances often includes an exclamation of “I didn’t know libraries did that,” and this experience is buttressed by Pew Internet Project research finding that only about 23 percent of people who already visit our libraries feel they know all or most of what we do. The need to invite people to rethink libraries is clear.

On the policy front, this also is a driving force behind the Policy Revolution! initiative—making sure national information policy matches the current and emerging landscape of how libraries are serving their communities. One of the first steps is simply to make modern libraries more visible to key decision-makers and influencers.

One of these influential groups, particularly for public libraries, is the International City/County Management Association (ICMA), which concluded its 100th anniversary conference in Charlotte this past week. I enjoyed connecting with city and county managers and their professional staffs over several days, both informally and formally through three library-related presentations.

The Aspen Institute kicked off my conference experience with a preview and discussion of its work emerging from the Dialogue on Public Libraries. Without revealing any details that might diminish the national release of the Aspen Institute report to come in October, I can say it was a lively and engaged discussion with city and county managers from communities of all sizes across the globe. One theme that emerged and resonated throughout the conference was one related to breaking down siloes and increasing collaboration. One participant described this force factor as “one plus one equals three” and referenced the ImaginOn partnership between the Charlotte Mecklenburg Library and the Children’s Theatre of Charlotte.

A young patron enjoys a Sunday afternoon at ImaginOn.

While one might think that the level of library knowledge and engagement in the room was perhaps exceptional, throughout my conversations, city and county managers described new library building projects and renovations, efforts to increase local millages, and proudly touted the energy and expertise of the library directors they work with in building vibrant and informed communities. In fact, they sounded amazingly like librarians in their enthusiasm and depth of knowledge!

Dr. John Bertot and I shared findings and new tools from the Digital Inclusion Survey, with a particular focus on how local communities can use the new interactive mapping tools to connect library assets to community demographics and concerns. ICMA is a partner with the American Library Association (ALA) and the University of Maryland Information Policy & Access Center on the survey, which is funded by the Institute of Museum and Library Services (IMLS). Through our presentation (ppt), we explored the components of digital inclusion and key data related to technology infrastructure, digital literacy and programs and services that support education, civic engagement, workforce and entrepreneurship, and health and wellness. Of greatest interest was—again—breaking down barriers…in this case among diverse datasets relating libraries and community priorities.

Finally, I was able to listen in on a roundtable on Public Libraries and Community Building in which the Urban Libraries Council (ULC) shared the Edge benchmarks and facilitated a conversation about how the benchmarks might relate to city/county managers’ priorities and concerns. One roundtable participant from a town of about 3,300 discovered during a community listening tour that the library was the first place people could send a fax, and often where they used a computer and the internet for the first time. How could the library continue to be the “first place” for what comes next in new technology? The answer: you need to have a facility and a culture willing to be nimble. One part of preparing the facility was to upgrade to a 100 Mbps broadband connection, which has literally increased traffic to this community technology hub as people drive in with their personal devices.

I was proud to get Outside the Lines at the ICMA conference, and am encouraged that so many of these city and county managers already had “met” the 21st century library and were interested in working together for stronger cities, towns, counties and states. Thanks #ICMA14 for embracing and encouraging library innovation!

The post “Outside the Lines” at ICMA appeared first on District Dispatch.

by Larra Clark at September 19, 2014 09:14 PM

FOSS4Lib Recent Releases

Evergreen - 2.5.7-rc1

Last updated September 19, 2014. Created by Peter Murray on September 19, 2014.

Release Date: Friday, September 5, 2014

by Peter Murray at September 19, 2014 08:28 PM

Evergreen - 2.6.3

Last updated September 19, 2014. Created by Peter Murray on September 19, 2014.

Release Date: Friday, September 5, 2014

by Peter Murray at September 19, 2014 08:27 PM

Evergreen - 2.7.0

Last updated September 19, 2014. Created by Peter Murray on September 19, 2014.

Release Date: Thursday, September 18, 2014

by Peter Murray at September 19, 2014 08:27 PM

FOSS4Lib Upcoming Events

Fedora 4.0 in Action at The Art Institute of Chicago and UCSD

Wednesday, October 15, 2014 - 13:00 to 14:00

Last updated September 19, 2014. Created by Peter Murray on September 19, 2014.

Presented by: Stefano Cossu, Data and Application Architect, Art Institute of Chicago and Esmé Cowles, Software Engineer, University of California San Diego
Join Stefano and Esmé as they showcase new pilot projects built on Fedora 4.0 Beta at the Art Institute of Chicago and the University of California San Diego. These projects demonstrate the value of adopting Fedora 4.0 Beta and taking advantage of new features and opportunities for enhancing repository data.

by Peter Murray at September 19, 2014 08:16 PM


Talk Like a Pirate – library metadata speaks

Pirate Hunter, Richard Zacks

Friday, 19 September is of course well known as International Talk Like a Pirate Day. In order to mark the day, we created not one but FIVE lists (rolled out over this whole week). This is part of our What In the WorldCat? series (#wtworldcat lists are created by mining data from WorldCat in order to highlight interesting and different views of the world’s library collections).

If you have a suggestion for something you’d like us to feature, let us know or leave a comment below.


by Merrilee at September 19, 2014 07:32 PM

FOSS4Lib Upcoming Events

VuFind Summit 2014

Monday, October 13, 2014 - 08:00 to Tuesday, October 14, 2014 - 17:00

Last updated September 19, 2014. Created by Peter Murray on September 19, 2014.

This year's VuFind Summit will be held on October 13-14 at Villanova University (near Philadelphia).

Registration for the two-day event is $40 and includes both morning refreshments and a full lunch for both days.

It is not too late to submit a talk proposal and, if accepted, have your registration fee waived.

by Peter Murray at September 19, 2014 07:18 PM

State Library of Denmark

Sparse facet caching

As explained in Ten times faster, distributed faceting in standard Solr is two-phase:

  1. Each shard performs standard faceting and returns the top limit*1.5+10 terms. The merger calculates the top limit terms. Standard faceting is a two-step process:
    1. For each term in each hit, update the counter for that term.
    2. Extract the top limit*1.5+10 terms by running through all the counters with a priority queue.
  2. Each shard returns the number of occurrences of each term in the top limit terms, calculated by the merger from phase 1. This is done by performing a mini-search for each term, which takes quite a long time. See Even sparse faceting is limited for details.
    1. Addendum: If the number for a term was returned by a given shard in phase 1, that shard is not asked for that term again.
    2. Addendum: If the shard returned a count of 0 for any term as part of phase 1, that means it has delivered all possible counts to the merger. That shard will not be asked again.
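The two phases and addendum 1 can be sketched like this (a toy in-memory model, not Solr code; shards are represented as plain lists of terms):

```python
from collections import Counter

def phase1(shard_hits, limit):
    """Each shard counts terms in its hits (step 1) and returns its
    top limit*1.5+10 terms (step 2, a priority queue internally)."""
    counts = Counter(shard_hits)
    k = int(limit * 1.5 + 10)
    return dict(counts.most_common(k))

def merge(shard_tops, shard_hits_by_id, limit):
    """Merger: sum phase-1 counts, pick the top `limit` terms, then
    ask each shard only for terms it did not already report
    (addendum 1); here the per-term 'mini-search' is a list count."""
    merged = Counter()
    for top in shard_tops.values():
        merged.update(top)
    wanted = [term for term, _ in merged.most_common(limit)]
    for sid, top in shard_tops.items():
        for term in (t for t in wanted if t not in top):
            merged[term] += shard_hits_by_id[sid].count(term)  # phase 2
    return {term: merged[term] for term in wanted}
```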

Sparse speedup

Sparse faceting speeds up phase 1 step 2 by only visiting the updated counters. It also speeds up phase 2 by repeating phase 1 step 1, then extracting the counts directly for the wanted terms. Although it sounds heavy to repeat phase 1 step 1, the total time for phase 2 with sparse faceting is a lot lower than with standard Solr. But why repeat phase 1 step 1 at all?


Today, caching of the counters from phase 1 step 1 was added to Solr sparse faceting. Caching is tricky business to get just right, especially since the sparse cache must contain a mix of empty counters (to avoid re-allocation of large structures on the Java heap) as well as filled structures (from phase 1, intended for phase 2). But theoretically, it is simple: When phase 1 step 1 is finished, the counter structure is kept and re-used in phase 2. So time for testing:
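A sketch of what such a mixed cache might look like (my guess at the shape, not the actual Solr sparse faceting code; plain dicts stand in for the large counter structures):

```python
class SparseCounterCache:
    """Pool holding a mix of empty counter structures (to avoid
    re-allocating large structures on the heap) and filled ones
    (phase-1 results kept for the expected phase-2 request)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.empty = []    # clean structures ready for re-use
        self.filled = {}   # query key -> phase-1 counters

    def acquire(self):
        """Get a counter structure, reusing a pooled one if possible."""
        return self.empty.pop() if self.empty else {}

    def keep_filled(self, query_key, counters):
        """After phase 1: keep the filled counters for phase 2,
        or clean and pool them if the cache is already full."""
        if len(self.filled) < self.max_size:
            self.filled[query_key] = counters
        else:
            counters.clear()
            self.empty.append(counters)

    def take_filled(self, query_key):
        """At phase 2: reclaim the phase-1 counters, if still cached."""
        return self.filled.pop(query_key, None)
```

The awkward case described below (phase 2 skipped entirely) corresponds to a `keep_filled` whose entry is never taken: it must eventually be cleaned or discarded, which is pure overhead.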

15TB index / 5B docs / 2565GB RAM, faceting on 6 fields, facet limit 25, unwarmed queries

Note that there are no measurements of standard Solr faceting in the graph. See the Ten times faster article for that. What we have here are 4 different types of search:


For 1-1000 hits, nocache is actually a bit faster than cache. The peculiar thing about this hit-range is that chances are high that all shards return all possible counts (phase 2 addendum 2), so phase 2 is skipped for a lot of searches. When phase 2 is skipped, the caching of a filled counter structure is wasted: the structure needs to be either cleaned for re-use or discarded if the cache is getting too big. This means a bit of overhead.

For more than 1000 hits, cache wins over nocache. Filter through the graph noise by focusing on the medians. As the difference between cache and nocache is that the base faceting time is skipped with cache, the difference of their medians should be about the same as the difference of the medians for no_facet and skip. Are they? Sorta-kinda. This should be repeated with a larger sample.


Caching with distributed faceting means a small performance hit in some cases and a larger performance gain in others. Nothing Earth-shattering, and since it works best when more memory is allocated for caching, it is not clear in general whether it is best to use it or not. Download a Solr sparse WAR from GitHub and try for yourself.

by Toke Eskildsen at September 19, 2014 02:40 PM

Library of Congress: The Signal

Emerging Collaborations for Accessing and Preserving Email

The following is a guest post by Chris Prom, Assistant University Archivist and Professor, University of Illinois at Urbana-Champaign.

I’ll never forget one lesson from my historical methods class at Marquette University.  Ronald Zupko–famous for his lecture about the bubonic plague and a natural showman–was expounding on what it means to interrogate primary sources–to cast a skeptical eye on every source, to see each one as a mere thread of evidence in a larger story, and to remember that every event can, and must, tell many different stories.

He asked us to name a few documentary genres, along with our opinions as to their relative value.  We shot back: “Photographs, diaries, reports, scrapbooks, newspaper articles,” along with the type of ill-informed comments graduate students are prone to make.  As our class rattled off responses, we gradually came to realize that each document reflected the particular viewpoint of its creator–and that the information a source conveyed was constrained by documentary conventions and other social factors inherent to the medium underlying the expression. Settling into the comfortable role of skeptics, we noted the biases each format reflected.  Finally, one student said: “What about correspondence?”  Dr Zupko erupted: “There is the real meat of history!  But, you need to be careful!”


Dangerous Inbox by Recrea HQ. Photo courtesy of Flickr through a CC BY-NC-SA 2.0 license.

Letters, memos, telegrams, postcards: such items have long been the stock-in-trade for archives.  Historians and researchers of all types, while mindful of the challenges in using correspondence, value it as a source for the insider perspective it provides on real-time events.   For this reason, the library and archives community must find effective ways to identify, preserve and provide access to email and other forms of electronic correspondence.

After I researched and wrote a guide to email preservation (pdf) for the Digital Preservation Coalition’s Technology Watch Report series, I concluded that the challenges are mostly cultural and administrative.

I have no doubt that with the right tools, archivists could do what we do best: build the relationships that underlie every successful archival acquisition.  Engaging records creators and donors in their digital spaces, we can help them preserve access to the records that are so sorely needed for those who will write histories.  But we need the tools, and a plan for how to use them.  Otherwise, our promises are mere words.

For this reason, I’m so pleased to report on the results of a recent online meeting organized by the National Digital Stewardship Alliance’s Standards and Practices Working Group.  On August 25, a group of fifty-plus experts from more than a dozen institutions informally shared the work they are doing to preserve email.

For me, the best part of the meeting was that it represented the diverse range of institutions (in terms of size and institutional focus) that are interested in this critical work. Email preservation is not something of interest only to large government archives, or to small collecting repositories, but to every repository in between. That said, the representatives displayed a surprisingly similar vision for how email preservation can be made effective.

Robert Spangler, Lisa Haralampus, Ken Hawkins and Kevin DeVorsey described challenges that the National Archives and Records Administration has faced in controlling and providing access to large bodies of email. Concluding that traditional records management practices are not sufficient to the task, NARA has developed the Capstone approach, seeking to identify and preserve particular accounts that must be preserved as a record series, and is currently revising its transfer guidance.  Later in the meeting, Mark Conrad described the particular challenge of preserving email from the Executive Office of the President, highlighting the point that “scale matters”–a theme that resonated across the board.

The whole account approach that NARA advocates meshes well with activities described by other presenters.  For example, Kelly Eubank from North Carolina State Archives and the EMCAP project discussed the need for software tools to ingest and process email records while Linda Reib from the Arizona State Library noted that the PeDALS Project is seeking to continue their work, focusing on account-level preservation of key state government accounts.

Functional comparison of selected email archives tools/services. Courtesy Wendy Gogel.


Ricc Ferrante and Lynda Schmitz Fuhrig from the Smithsonian Institution Archives discussed the CERP project which produced, in conjunction with the EMCAP project, an XML schema for email objects among its deliverables. Kate Murray from the Library of Congress reviewed the new email and related calendaring formats on the Sustainability of Digital Formats website.

Harvard University was up next.  Andrea Goethals and Wendy Gogel shared information about Harvard’s Electronic Archiving Service.  EAS includes tools for normalizing email from an account into EML format (conforming to the Internet Engineering Task Force RFC 2822), then packaging it for deposit into Harvard’s digital repository.
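
An EML file is essentially the raw RFC 2822 message text. This is not Harvard's EAS code, just a minimal sketch of what the format looks like and how its headers split out, using an invented example message and ignoring complications like folded headers and MIME parts:

```javascript
// A minimal RFC 2822-style message: header lines, a blank line, then the body.
// The addresses and subject here are invented for illustration.
var eml = [
  'From: sender@example.org',
  'To: archive@example.org',
  'Subject: Test message',
  '',
  'Body text here.'
].join('\r\n');

// Headers end at the first blank line; each header is "Name: value".
var headerPart = eml.split('\r\n\r\n')[0];
var headers = {};
headerPart.split('\r\n').forEach(function (line) {
  var i = line.indexOf(':');
  headers[line.slice(0, i).toLowerCase()] = line.slice(i + 1).trim();
});

console.log(headers.subject);  // 'Test message'
```

A real normalizer has to cope with folded (multi-line) headers, character encodings, and attachments, which is exactly why tools like EAS exist.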

One of the most exciting presentations was provided by Peter Chan and Glynn Edwards from Stanford University.  With generous funding from the National Historical Publications and Records Commission, as well as some internal support, the ePADD Project (“Email: Process, Appraise, Discover, Deliver”) is using natural language processing and entity extraction tools to build an application that will allow archivists and records creators to review email, then process it for search, display and retrieval.  Best of all, the web-based application will include a built-in discovery interface, and users will be able to define a lexicon and provide visual representations of the results.  Many participants in the meeting commented that the ePADD tools may provide a meaningful focus for additional collaborations.  A beta version is due out next spring.

In the discussion that followed the informal presentations, several presenters congratulated the Harvard team on a slide Wendy Gogel shared, comparing the functions provided by various tools and services (reproduced above).

As is apparent from even a cursory glance at the chart, repositories are doing wonderful work—and much yet remains.

Collaboration is the way forward. At the end of the discussion, participants agreed to take three specific steps to drive email preservation initiatives to the next level: (1) providing tool demo sessions; (2) developing use cases; and (3) working together.

The bottom line: I’m more hopeful about the ability of the digital preservation community to develop an effective approach toward email preservation than I have been in years.  Stay tuned for future developments!

by Kate Murray at September 19, 2014 01:02 PM


Tech Yourself Before You Wreck Yourself – Vol. 1

RobotArt from Cécile Graat

This post is for all the tech librarian caterpillars dreaming of one day becoming empowered tech butterflies. The internet is full to the brim with tools and resources for aiding in your transformation (and your job search). In each installment of Tech Yourself Before You Wreck Yourself – TYBYWY, pronounced tie-buy-why – I’ll curate a small selection of free courses, webinars, and other tools you can use to learn and master technologies.  I’ll also spotlight a presentation opportunity so that you can consider putting yourself out there – it’s a big, beautiful community and we all learn through collaboration.

MOOC of the Week -

Allow me to suggest you enroll in The Emerging Future: Technology Issues and Trends, a MOOC offered by the School of Information at San Jose State University through Canvas. Taking a Futurist approach to technology assessment, Sue Alman, PhD offers participants an opportunity to learn “the planning skills that are needed, the issues that are involved, and the current trends as we explore the potential impact of technological innovations.”

Sounds good to this would-be Futurist!

Worthwhile Webinars –

I live in the great state of Texas, so it is with some pride that I recommend the recurring series, Tech Tools with Tine, from the Texas State Library and Archives Commission.  If you’re like me, you like your tech talks in manageable bite-size pieces. This is just your style.

September 19th, 9-10 AM EST – Tech Tools with Tine: 1 Hour of Google Drive

September 26th, 9-10 AM EST – Tech Tools with Tine: 1 Hour of MailChimp

October 3rd, 9-10 AM EST – Tech Tools with Tine: 1 Hour of Curation with Pinterest and Tumblr

Show Off Your Stuff –

The deadline to submit a proposal to the 2015 Library Technology Conference at Macalester College in beautiful St. Paul is September 22nd. Maybe that tight timeline is just the motivation you’ve been looking for!

What’s up, Tiger Lily? -

Are you a tech caterpillar or a tech butterfly? Do you have any cool free webinars or opportunities you’d like to share? Write me all about it in the comments.

by Lindsay Cronk at September 19, 2014 12:30 PM

September 18, 2014

OCLC Dev Network

Release Scheduling Update

To accommodate additional performance testing and optimization, the September release of WMS, which includes changes to the WMS Vendor Information Center API, is being deferred.  We will communicate the new date for the release as soon as we have confirmation.

by Shelley Hostetler at September 18, 2014 09:30 PM

District Dispatch

The Goodlatte, the bad and the ugly…

My Washington Office colleague Carrie Russell, ALA’s copyright ace in the Office of Information Technology Policy, provides a great rundown here in DD on the substantive ins and outs of the House IP Subcommittee’s hearing yesterday. The Subcommittee met to take testimony on the part of the 1998 Digital Millennium Copyright Act (Section 1201, for those of you keeping score at home) that prohibits anyone from “circumventing” any kind of “digital locks” (aka, “technological protection measures,” or “TPMs”) used by their owners to protect copyrighted works. The hearing was also interesting, however, for the politics of the emerging 1201 debate on clear display.

First, the good news.  Rep. Bob Goodlatte (VA), Chairman of the full House Judiciary Committee, made time in a no doubt very crowded day to attend the hearing specifically for the purpose of making a statement in which he acknowledged that targeted reform of Section 1201 was needed and appropriate.  As one of the original authors of 1201 and the DMCA, and the guy with the big gavel, Mr. Goodlatte’s frank and informed talk was great to hear.

Likewise, Congressman Darrell Issa of California (who’s poised to assume the Chairmanship of the IP Subcommittee in the next Congress and eventually to succeed Mr. Goodlatte at the full Committee’s helm) agreed that Section 1201 might well need modification to prevent it from impeding technological innovation — a cause he’s championed over his years in Congress as a technology patent-holder himself.

Lastly, Rep. Blake Farenthold added his voice to the reform chorus.  While a relatively junior Member of Congress, Rep. Farenthold clearly “gets” the need to assure that 1201 doesn’t preclude fair use or valuable research that requires digital locks to be broken precisely to see if they create vulnerabilities in computer apps and networks that can be exploited by real “bad guys,” like malware- and virus-pushing lawbreakers.

Of course, any number of other members of the Subcommittee were singing loudly in the key of “M” for yet more copyright protection.  Led by the most senior Democrat on the full Judiciary Committee, Rep. John Conyers (MI), multiple members appeared (as Carrie described yesterday) to believe that “strengthening” Section 1201 in unspecified ways would somehow thwart … wait for it … piracy, as if another statute and another penalty would do anything to affect the behavior of industrial-scale copyright infringers in China who don’t think twice now about breaking existing US law.  Sigh….

No legislation is yet pending to change Section 1201 or other parts of the DMCA, but ALA and its many coalition partners in the public and private sectors will be in the vanguard of the fight to reform this outdated and ill-advised part of the law (including the triennial process by which exceptions to Section 1201 are granted, or not) next year.  See you there!

The post The Goodlatte, the bad and the ugly… appeared first on District Dispatch.

by Adam Eisgrau at September 18, 2014 08:55 PM


Say Hello to Lucidworks Fusion

The team at Lucidworks is proud to announce the release of our next-generation platform for building powerful, scalable search applications: Lucidworks Fusion.

Fusion extends any Solr deployment with the enterprise-grade capabilities you need to deliver a world-class search experience:

Full support for any Solr deployment including Lucidworks Search, SolrCloud, and stand-alone mode.

Deeper support for recommendations including Item-to-Query, Query-to-Item, and Item-to-Item with aggregated signals.

Advanced signal processing including any datapoint (click-through, purchases, ratings) – even social signals like Twitter.

Enhanced application development with REST APIs, index-side and query-time pipelines, with sophisticated connector frameworks.

Advanced web and filesystem crawlers with multi-threaded HTML/document connectors, de-duping, and incremental crawling.

Integrated security management for roles and users supporting HTTPs, form-based, Kerberos, LDAP, and native methods.

Search, log, and trend analytics for any log type with real-time and historical data with SiLK.

Ready to learn more? Join us for our upcoming webinar:

Webinar: Meet Lucidworks Fusion

Join Lucidworks CTO Grant Ingersoll for a ‘first look’ at our latest release, Lucidworks Fusion. You’ll be among the first to see the power of the Fusion platform and how it gives you everything you need to design, build, and deploy amazing search apps.

Webinar: Meet Lucidworks Fusion
Date: Thursday, October 2, 2014
Time: 11:00 am Pacific Daylight Time (San Francisco, GMT-07:00)

Click here to register for this webinar.

Or learn more at

by Lucidworks at September 18, 2014 08:43 PM

John Miedema

Wilson iteration plans: Topics on text mining the novel.

The Wilson iteration of my cognitive system will involve a deep dive into topics on text mining the novel. My overly ambitious plans are the following, roughly in order:

We’ll see where it goes.

by johnmiedema at September 18, 2014 08:27 PM


Nearly 100,000 items from the Getty Research Institute now available in DPLA

More awesome news from DPLA! Hot on the heels of announcements earlier this week about newly added materials from the Medical Heritage Library and the Government Printing Office, we’re excited to share today that nearly 100,000 items from the Getty Research Institute are now available via DPLA.

To view the Getty in DPLA, click here.

From an announcement posted today on the Getty Research Institute Blog:

As a DPLA content hub, the Getty Research Institute has contributed metadata—information that enables search and retrieval of material—for nearly 100,000 digital images, documentary photograph collections, archives, and books dating from the 1400s to today. We’ve included some of the most frequently requested and significant material from our holdings of more than two million items, including some 5,600 images from the Julius Shulman photography archive, 2,100 images from the Jacobson collection of Orientalist photography, and dozens of art dealers’ stockbooks from the Duveen and Knoedler archives.

The Getty will make additional digital content available through DPLA as their collections continue to be cataloged and digitized.

All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

by DPLA at September 18, 2014 08:03 PM

Alf Eaton, Alf

Archiving and displaying tweets with dat

First, make a new directory for the project:

mkdir tweet-stream && cd $_

Install node.js (nodejs in Debian/Ubuntu, node in Homebrew), update npm if needed (npm install -g npm) and install dat:

npm install -g maxogden/dat

dat is essentially git for data, so the data repository needs to be initialised before it can be used:

dat init

Next, start the dat server to listen for incoming connections:

dat listen

Data can be piped into dat as line-delimited JSON (i.e. one object per line - the same idea as CSV but with optional nested data). Happily, this is the format in which Twitter’s streaming API provides information, so it's ideal for piping into dat.
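
The "one object per line" idea can be sketched in a few lines of JavaScript; the tweet objects here are invented for illustration (real streaming-API objects carry many more fields):

```javascript
// Line-delimited JSON (NDJSON): one complete JSON object per line,
// no enclosing array. These tweet objects are made up for the example.
var tweets = [
  { id_str: '1', text: 'first tweet' },
  { id_str: '2', text: 'second tweet' }
];

// Serialize: one JSON.stringify per object, joined by newlines.
var ndjson = tweets.map(function (t) { return JSON.stringify(t); }).join('\n');

// Parse: split on newlines and JSON.parse each non-empty line.
var parsed = ndjson.split('\n').filter(Boolean).map(function (line) {
  return JSON.parse(line);
});

console.log(parsed.length);    // 2
console.log(parsed[1].text);   // 'second tweet'
```

Because each line stands alone, a consumer like dat can process a stream incrementally without waiting for a closing bracket, which is what makes the format a good fit for piping.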

I used a PHP client to connect to Twitter’s streaming API as I was interested in seeing how it handled the connection (the client needs to watch the connection and reconnect if no data is received in a certain time frame). There may be a command-line client that is even easier than this, but I haven’t found one yet…

Install Phirehose using Composer:

composer init && composer require fennb/phirehose:dev-master && composer install

The streaming API uses OAuth 1.0 for authentication, so you have to register a Twitter application to get an OAuth consumer key and secret, then generate another access token and secret for your account. Add these to this small PHP script that initialises Phirehose, starts listening for filtered tweets and outputs each tweet to STDOUT as it arrives:

Run the script to connect to the streaming API and start importing data: php stream.php | dat import -json

The dat server that was started earlier with dat listen is listening on port 6461 for clients, and is able to emit each incoming tweet as a Server-Sent Event, which can then be consumed in JavaScript using the EventSource API.

I’m in the process of making a twitter-stream Polymer element, but in the meantime this is how to connect to dat’s SSE endpoint:

var server = new EventSource('http://your-dat-ip-address:6461/api/changes?data=true&style=sse&live=true&limit=1&tail=1');

server.addEventListener('data', function (event) {
    var item = JSON.parse(event.data);
    // do something with the tweet
});

September 18, 2014 07:47 PM

Patrick Hochstenbach

Hard Reset

Joining Hard Reset, a playground for illustrators to draw cartoons about a post-apocalyptic world. I draw these doodles during my 20-minute commute from Brugge to Ghent. Filed under: Comics Tagged: art, cartoon, comic, comics, commute, copic, doodle,

by hochstenbach at September 18, 2014 06:52 PM

Jonathan Rochkind

Umlaut 4.0 beta

Umlaut is an open source specific-item discovery layer, often used on top of SFX, and based on Rails.

Umlaut 4.0.0.beta2 is out! (Yeah, don’t ask about beta1 :) ).

This release is mostly back-end upgrades, including:

Anyone interested in beta testing? Probably most interesting if you have an SFX to point it at, but you can take it for a spin either way.

To install a new Umlaut app, see:

Filed under: General

by jrochkind at September 18, 2014 06:39 PM

Andromeda Yelton

jQuery workshop teaching techniques, part 3: ruthless backward design

I’m writing up what I learned from teaching a jQuery workshop this past month. I’ve already posted on my theoretical basis, pacing, and supporting affective goals. Now for the technique I invested the most time in and got the most mileage out of…

Ruthless backward design

Yes, yes, we all know we are supposed to do backward design, and I always have a general sense of it in my head when I design courses. In practice it’s hard, because you can’t always carve out the time to write an entire course in advance of teaching it, but for a two-day bootcamp I’m doing that anyway.

Yeah. Super ruthless. I wrote the last lesson, on functions, first. Along the way I took notes of every concept and every function that I relied on in constructing my examples. Then I wrote the second-to-last lesson, using what I could from that list (while keeping the pacing consistent), and taking notes on anything else I needed to have already introduced – again, right down to the granularity of individual jQuery functions. Et cetera. My goal was that, by the time they got to writing their own functions (with the significant leap in conceptual difficulty that entails), they would have already seen every line of code that they’d need to do the core exercises, so they could work on the syntax and concepts specific to functions in isolation from all the other syntax and concepts of the course. (Similarly, I wanted them to be able to write loops in isolation from the material in lessons 1 and 2, and if/then statements in isolation from the material in lesson 1.)

This made it a lot easier for me to see both where the big conceptual leaps were and what I didn’t need. I ended up axing .css() in favor of .addClass(), .removeClass(), and .hasClass() – more functions, but all conceptually simpler ones, and more in line with how I’ve written real-world code anyway. It meant that I axed booleans – which in writing out notes on course coverage I’d assumed I’d cover (such a basic data type, and so approachable for librarians!) – when I discovered I did not need their conceptual apparatus to make the subsequent code make sense. It made it clear that .indexOf() is a pain, and students would need to be familiar with its weirdness so it didn’t present any hurdles when they had to incorporate it into bigger programs.
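
The .indexOf() weirdness is its -1 sentinel return value. A quick illustration (plain JavaScript, not taken from the workshop materials):

```javascript
var fruits = ['apple', 'banana', 'cherry'];

// indexOf returns the position of the first match...
console.log(fruits.indexOf('banana'));   // 1

// ...and -1 (not false or undefined) when there is no match.
console.log(fruits.indexOf('durian'));   // -1

// The trap: -1 is truthy, so this check looks right but is wrong.
if (fruits.indexOf('durian')) {
  console.log('runs even though durian is absent');
}

// The correct test compares against the -1 sentinel explicitly.
var found = fruits.indexOf('durian') !== -1;  // false
```

That sentinel convention is exactly the kind of incidental complexity a beginner has to absorb before it stops being a hurdle in larger programs.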

Funny thing: being this ruthless and this granular meant I actually did get to the point where I could have done real-world-ish exercises with one more session. I ended up providing a few as further practice options for students who chose jQuery practice rather than the other unconference options for Tuesday afternoon. By eliminating absolutely everything unnecessary, right down to individual lines of code, I covered enough ground to get there. Huh!

So yeah. If I had a two-day workshop, I’d set out with that goal. A substantial fraction of the students would feel very shaky by then – it’s still a ton of material to assimilate, and about a third of my survey respondents’ brains were full by the time we got to functions – but including a real-world application would be a huge motivational payoff regardless. And group work plus an army of TAs would let most students get some mileage out of it. Add an option for people to review earlier material in the last half-day, and everyone’s making meaningful progress. Woot!

Also, big thanks to Sumana Harihareswara for giving me detailed feedback on a draft of the lesson, and helping me see the things I didn’t have the perspective to see about sequencing, clarity, etc. You should all be lucky enough to have a proofreader so enthusiastic and detail-oriented.

Later: where I want to go next.

by Andromeda at September 18, 2014 05:04 PM

Open Knowledge Foundation

Announcing a Leadership Update at Open Knowledge

Today I would like to share some important organisational news. After 3 years with Open Knowledge, Laura James, our CEO, has decided to move on to new challenges. As a result of this change we will be seeking to recruit a new senior executive to lead Open Knowledge as it continues to evolve and grow.

As many of you know, Laura James joined us to support the organisation as we scaled up, and stepped up to the CEO role in 2013. It has always been her intention to return to her roots in engineering at an appropriate juncture, and we have been fortunate to have had Laura with us for so long – she will be sorely missed.

Laura has made an immense contribution and we have been privileged to have her on board – I’d like to extend my deep personal thanks to her for all she has done. Laura has played a central role in our evolution as we’ve grown from a team of half-a-dozen to more than forty. Thanks to her commitment and skill we’ve navigated many of the tough challenges that accompany “growing-up” as an organisation.

There will be no change in my role (as President and founder) and I will be here both to continue to help lead the organisation and to work closely with the new appointment going forward. Laura will remain in post, continuing to manage and lead the organisation, assisting with the recruitment and bringing the new senior executive on board.

For a decade, Open Knowledge has been a leader in its field, working at the forefront of efforts to open up information around the world and see it used to empower citizens and organisations to drive change. Both the community and original non-profit have grown – and continue to grow – very rapidly, and the space in which we work continues to develop at an incredible pace with many exciting new opportunities and activities.

We have a fantastic future ahead of us and I’m very excited as we prepare Open Knowledge to make its next decade even more successful than its first.

We will keep everyone informed in the coming weeks as our plans develop, and there will also be opportunities for the Open Knowledge community to discuss. In the meantime, please don’t hesitate to get in touch with me if you have any questions.

by Rufus Pollock at September 18, 2014 03:05 PM

District Dispatch

Free webinar: Helping patrons set financial goals

On September 23rd, the Consumer Financial Protection Bureau and the Institute for Museum and Library Services will offer a free webinar on financial literacy. This session has limited space so please register quickly.

Sometimes, if you’re offering programs on money topics, library patrons may come to you with questions about setting money goals. To assist librarians, the Consumer Financial Protection Bureau and the Institute of Museum and Library Services are developing financial education tools and sharing best practices with the public library field.

The two agencies created the partnership to help libraries provide free, unbiased financial information and referrals in their communities, build local partnerships and promote libraries as community resources. As part of the partnership, both agencies gathered information about libraries and financial education. Their surveys focused on attitudes about financial education, and how librarians can facilitate more financial education programs.

Join both groups on Tuesday, September 23, 2014, from 2:30–3:30 p.m. Eastern Time for the free webinar “Setting money goals,” which will explore the basics of money management. The webinar will teach participants how to help patrons create effective money goals.

Webinar Details

September 23, 2014
2:30–3:30 p.m. Eastern
Join the webinar (No need to RSVP)

If you are participating only by phone, please dial the following number:

The post Free webinar: Helping patrons set financial goals appeared first on District Dispatch.

by Emily Sheketoff at September 18, 2014 02:51 PM

OCLC Dev Network

Reminder: Developer House Nominations Close on Monday

If you've been thinking about nominating someone – including yourself – for Developer House this December, there’s no time like the present to submit that nomination form.

by Shelley Hostetler at September 18, 2014 02:45 PM

Open Knowledge Foundation

Launching a new collaboration in Macedonia with Metamorphosis and the UK Foreign & Commonwealth Office


As part of The Open Data Civil Society Network Project, School of Data Fellow Dona Djambaska, who works with the local independent nonprofit Metamorphosis, explains the value of the programme and what we hope to achieve over the next 24 months.

“The concept of Open Data is still very fresh among Macedonians. Citizens, CSOs and activists are just beginning to realise the meaning and power hidden in data. They are beginning to sense that there is some potential for them to use open data to support their causes, but in many cases they still don’t understand the value of open data, how to advocate for it, how to find it and most importantly – how to use it!

Metamorphosis was really pleased to get this incredible opportunity to work with the UK Foreign Office and our colleagues at Open Knowledge, to help support the open data movement in Macedonia. We know that an active open data ecosystem in Macedonia, and throughout the Balkan region, will support Metamorphosis’s core objectives of improving democracy and increasing quality of life for our citizens.

It’s great to help all these wonderful minds join together and co-build a community where everyone gets to teach and share. This collaboration with Open Knowledge and the UK Foreign Office is a really amazing stepping-stone for us.

We are starting the programme with meet-ups and then moving to more intense (online and offline) communications and awareness raising events. We hope our tailored workshops will increase the skills of local CSOs, journalists, students, activists or curious citizens to use open data in their work – whether they are trying to expose corruption or find new efficiencies in the delivery of government services.

We can already see the community being built, and the network spreading among Macedonian CSOs and hope that this first project will be part of a more regional strategy to support democratic processes across the Balkan region.”

Read our full report on the project: Improving governance and higher quality delivery of government services in Macedonia through open data

Dona Djambaska, Macedonia.

Dona graduated in the field of Environmental Engineering and has been working with the Metamorphosis foundation in Skopje for the past six years assisting on projects in the field of information society.

There she has focused on organising trainings for computer skills, social media, online promotion, photo and video activism. Dona is also an active contributor and member of the Global Voices Online community. She dedicates her spare time to artistic and activism photography.

by Guest at September 18, 2014 02:07 PM

Ed Summers

Satellite of Art

… still there

by ed at September 18, 2014 01:26 PM

FOSS4Lib Recent Releases

BitCurator - 0.9.20

Last updated September 18, 2014. Created by Peter Murray on September 18, 2014.

Release Date: Friday, September 5, 2014

by Peter Murray at September 18, 2014 12:36 PM

Peter Murray

Thursday Threads: Patron Privacy on Library Sites, Communicating with Developers, Kuali Continued


In the DLTJ Thursday Threads this week: an analysis of how external services included on library web pages can impact patron privacy, pointers to a series of helpful posts from OCLC on communication between software users and software developers, and lastly an update on the continuing discussion of the Kuali Foundation Board’s announcement forming a commercial entity.

Before we get started on this week’s threads, I want to point out a free online symposium that LYRASIS is hosting next week on sustainable cultural heritage open source software. Details are on the FOSS4Lib site, you can register on the LYRASIS events site, and then join the open discussion on the site before, during and after the symposium.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to my Pinboard bookmarks are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

Analysis of Privacy Leakage on a Library Catalog Webpage

My post last month about privacy on library websites, and the surrounding discussion on the Code4Lib list, prompted me to do a focused investigation, which I presented at last week’s Code4Lib-NYC meeting.
I looked at a single web page from the NYPL online catalog. I used Chrome developer tools to trace all the requests my browser made in the process of building that page. The catalog page in question is for The Communist Manifesto. It’s here: …

So here are the results.

- Analysis of Privacy Leakage on a Library Catalog Webpage, by Eric Hellman, Go To Hellman, 16-Sep-2014

Eric goes on to note that he isn’t criticizing the New York Public Library, but rather looking at a prominent system run by people who are careful about privacy concerns, and also because NYPL was the host of the Code4Lib-NYC meeting. His analysis of what goes on behind the scenes of a web page is illuminating, though, showing how all the careful work to protect patrons’ privacy while browsing the library’s catalog can be brought down by the inclusion of one simple JavaScript widget.

Series of Posts on Software Development Practices from OCLC

This is the first post in a series on software development practices. We’re launching the series with a couple of posts aimed at helping those who might not have a technical background communicate their feature requests to developers.

- Software Development Practices: What’s the Problem?, by Shelly Hostetler, OCLC Developer Network, 22-Aug-2014

OCLC has started an excellent set of posts on how to improve communication between software users and software developers. The first three have been posted so far with another one expected today:

  1. Software Development Practices: What’s the Problem?
  2. Software Development Practices: Telling Your User’s Story
  3. Software Development Practices: Getting Specific with Acceptance Criteria

I’ve bookmarked them and will be referring to them when talking with our own members about software development needs.

Kuali 2.0 Discussion Continues

…I thought of my beehives and how the overall bee community supports that community/hive. The community needs to be protected, prioritized, supported and nourished any way possible. Each entity, the queen, the workers and the drones all know their jobs, which revolve around protecting, supporting and nourishing the community.

Even if something disrupts the community, everyone knows their role and they get back to work in spite of the disruption. The real problem within the Kuali Community, with the establishment of the Kuali Commercial Entity now is that various articles, social media outlets, and even the communication from the senior Kuali leadership to the community members, have created a situation in which many do not have a good feel for their role in protecting, prioritizing, supporting and nourishing the community.

- The Evolving Kuali Narrative, by Kent Brooks, “I was just thinking”, 14-Sep-2014

The Kuali Foundation Board has set a direction for our second decade and at this time there are many unknowns as we work through priorities and options with each of the Kuali Project Boards. Kuali is a large and complex community of many institutions, firms, and individuals. We are working with projects now and hope to have some initial roadmaps very soon.

- Updates – Moving at the Speed of Light, by Jennifer Foutty, Kuali 2.0 Blog, 17-Sep-2014

As the library community that built a true next-generation library management system, the future of OLE’s development and long-term success is in our hands. We intend to continue to provide free and open access to our community designed and built software. The OLE board is strongly committed to providing a community driven option for library management workflow.

- Open Library Environment (OLE) & Kuali Foundation Announcement, by Bruce M. Taggart (Board Chair, Open Library Environment (OLE)), 9-Sep-2014

Building on previous updates here, the story of the commercialization of the Kuali collaborative continues. I missed the post from Bruce Taggart in last week’s update, and for the main DLTJ Thursday Threads audience this status update from the Open Library Environment project should be most interesting. Given the lack of information, it is hard not to parse each word of formal statements for underlying meanings. In the case of Dr. Taggart’s post about OLE, I find myself wondering what “community designed and built software” means. The Kuali 2.0 FAQ still says “the current plan is for the Kuali codebase to be forked and relicensed under the Affero General Public License (AGPL).” As Charles Severance points out, the Affero license can be a path to vendor lock-in. So is there to be a “community” version that has a life of its own under the Educational Community License while the KualiCo develops features only available under the Affero license? It is entirely possible that too much can be read into too few words, so I (for one) continue to ponder these questions and watch for the plan to evolve.

by Peter Murray at September 18, 2014 10:58 AM



September 17, 2014

District Dispatch

Celebrating Constitution Day the advocacy way

 “Dear Congressional Leaders –

We write to urge you to bring to the floor S. 607 and H.R. 1852, the bipartisan Leahy-Lee and Yoder-Polis bills updating the Electronic Communications Privacy Act (ECPA). Updating ECPA would respond to the deeply held concerns of Americans about their privacy. S. 607 and H.R. 1852 would make it clear that the warrant standard of the U.S. Constitution applies to private digital information just as it applies to physical property….”

… So said ALA today and more than 70 other civil liberties organizations, major technology companies and trade associations — including the U.S. Chamber of Commerce — in a strong joint letter to the leaders of the House and Senate calling for the soonest possible vote on bills pending in each chamber (S. 607 and H.R. 1852) to update the woefully outdated and inadequate Electronic Communications Privacy Act.  To reach every Member of Congress and their staffs, the letter also was published as a full page advertisement in Roll Call, a principal Capitol Hill newspaper widely read inside the Beltway and well beyond.

When last discussed in DD in mid-June, H.R. 1852 (the Email Privacy Act) had been cosponsored by a majority of all Members of the House.  Today, 265 members have signed on but the bill still awaits action in Committee.  With literally two work days remaining before the House and Senate recess for the November election, ALA and scores of its coalition partners wanted to remind all Members that these bills deserve a vote immediately after Congress returns in November.

Add your voice to that call too as Election 2014 heats up where you live!  Attend a “Town Hall” meeting, call in to a talk radio show featuring a campaigning Congressperson, or simply call their local office and demand that Congress protect your emails, photos, texts, tweets and anything else stored in the “cloud” by voting on and passing S. 607 and H.R. 1852.  Politics doesn’t get any more local and personal than the privacy of your electronic communications, which authorities don’t now need a warrant to pore over if they’re more than six months old.

Tell your Congressional Representative and Senators to update ECPA by passing S. 607 and H.R. 1852 as soon as they get back to Washington.

The post Celebrating Constitution Day the advocacy way appeared first on District Dispatch.

by Adam Eisgrau at September 17, 2014 11:08 PM

Getting on the same page

It can be difficult to respond to a question asked by a Member of Congress at a hearing when that person is talking about a different subject than you are and doesn’t know it. One observes a lot of talking past one another and frustration. One wants to stand up and say “wait a minute, you guys are talking about two different things,” but this kind of outburst is not appropriate at a Congressional hearing.

That happened today at the hearing called by U.S. House Judiciary Subcommittee on the Courts, Intellectual Property and the Internet. The topic was Chapter 12 of the copyright law and in particular, an administrative process conducted every three years by the U.S. Copyright Office called the 1201 rulemaking. But some thought the topic was digital rights management, and things got a little tense near the end of the hearing. Watch it for yourself.

There is a connection, and for clarity’s sake, let’s explore. The 1201 rulemaking was included in the Digital Millennium Copyright Act (DMCA) as a “safety valve” to ensure that technological protection measures (also known as digital rights management!) employed by rights holders to protect content would not also prevent non-infringing uses of copyrighted works, like analyzing software for security vulnerabilities, for example. Ask anyone, and they will tell you that the rulemaking is brutal. It’s long, convoluted and borders on the ridiculous. During this process, the U.S. Copyright Office evaluates specific requests for exemption from Section 1201’s otherwise blanket prohibition on “circumvention,” e.g., breaking pass codes, encryption or other digital rights management schemes in order to make a non-infringing use of a particular class of copyrighted works. The catch: to demonstrate that a non-infringing use is actually possible, rather than merely speculate that it is, an exemption-seeker must often already have broken the very anti-circumvention provision from which an exemption is sought.

Broadcast live streaming video on Ustream

The process can last eight months and includes writing detailed comments for submission, a reply comment period, two days of roundtables sometimes held in two or three places in the United States, and finally time for the U.S. Copyright Office, in collaboration with the National Telecommunications and Information Administration (NTIA), to write a lengthy report recommending to the Librarian of Congress which classes of works with technological protection measures may be circumvented for the next three years. Whew!

The Library Copyright Alliance (LCA) submitted comments arguing that the process certainly can be improved. Key LCA recommendations included that exemptions be permanent instead of lasting only three years, and that the NTIA (which has a better understanding of technology and innovation) administer the 1201 rulemaking process instead of the U.S. Copyright Office.

The good news: a baby step may have been taken. All of the witnesses agreed that some exemptions should be permanent so people do not have to reargue their case every three years. In addition, the Copyright Office already has made a suggestion to improve the rulemaking process, writing recently in the Federal Register:

Unlike in previous rulemakings, the Office is not requesting the submission of complete legal and factual support for such proposals at the outset of the proceeding. Instead, in this first step of the process, parties seeking an exemption may submit a petition setting forth specified elements of the proposed exemption and review and consolidate the petitions naming the list of proposed exemptions for further consideration.

Stay tuned for more news on the Copyright Office’s so-called “triennial” 1201 rulemaking which gives new meaning to the adage that “god (or the devil, if you prefer) is in the details.”

The post Getting on the same page appeared first on District Dispatch.

by Carrie Russell at September 17, 2014 09:59 PM


Open Technical Advisory Committee Call: September 23, 1:00 PM Eastern

The Technical Advisory Committee will hold an open call on Tuesday, September 23 at 1:00 PM EDT. The agenda can be found below. To register, follow the link below and complete the short form.


View in Google Docs for live notetaking


All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

by DPLA at September 17, 2014 09:00 PM


LITA Scholarships in Library and Information Science

CHICAGO — The Library and Information Technology Association (LITA), a division of the American Library Association, is pleased to announce that applications are being accepted for three Scholarships:

LITA/Christian Larew Memorial Scholarship (sponsored by Baker & Taylor)

LITA/LSSI Minority Scholarship (sponsored by Library Systems and Services, LLC)

LITA/OCLC Minority Scholarship (sponsored by Online Computer Library Center)

The scholarships are designed to encourage the entry of qualified persons into the library technology field.  The committees seek those who plan to follow a career in library and information technology, who demonstrate potential leadership, who hold a strong commitment to the use of automated systems in libraries and, for the minority scholarships, those who are qualified members of a principal minority group (American Indian or Alaskan native, Asian or Pacific Islander, African-American or Hispanic).

Candidates should illustrate their qualifications for the scholarships with a statement indicating the nature of their library experience, letters of reference and a personal statement of the applicant’s view of what he or she can bring to the profession, with particular emphasis on experiences that indicate potential for leadership and commitment to library automation.  Economic need is considered when all other criteria are equal.  Winners must have been accepted to an ALA recognized MLS Program.

You can apply for LITA scholarships through the single online application hosted by the ALA Scholarship Program. The ALA Scholarship Application Database will open Sept. 15.

References, transcripts and other documents must be postmarked no later than March 1, 2015 for consideration. All materials should be submitted to American Library Association, Scholarship Clearinghouse, c/o Human Resource Development & Recruitment, 50 East Huron Street, Chicago, IL 60611-2795. If you have questions about the LITA scholarships, please email the LITA Office.

The winners will be announced at the LITA President’s Program at the 2015 ALA Annual Conference in San Francisco.

by vedmonds at September 17, 2014 08:56 PM

LITA/Library Hi Tech Award Nominations Sought

Nominations are being accepted for the 2015 LITA/Library Hi Tech Award, which is given each year to an individual or institution for outstanding achievement in educating the profession about cutting edge technology through communication in continuing education within the field of library and information technology. Sponsored by the Library and Information Technology Association (LITA), a division of the American Library Association (ALA), and Library Hi Tech, the award includes a citation of merit and a $1,000 stipend provided by Emerald Group Publishing Limited, publishers of Library Hi Tech. The deadline for nominations is December 1, 2014.

The award, given to either a living individual or an institution, may recognize a single seminal work or a body of work created during or continuing into the five years immediately preceding the award year. The body of work need not be limited to published texts, but can include course plans or actual courses and/or non-print publications such as visual media. Awards are intended to recognize living persons rather than to honor the deceased; therefore, awards are not made posthumously. More information and a list of previous winners can be found in the Awards and Scholarships section.

Currently serving officers and elected officials of LITA, members of the LITA/Library Hi Tech Award Committee, and employees and their immediate family of Emerald Group Publishing are ineligible.

Nominations must include the name(s) of the recipient(s), basis for nomination, and references to the body of work.  Electronic submissions are preferred, but print submissions may also be sent to the LITA/Library Hi Tech Award Committee chair:

Holly Yu
University Library
California State University, Los Angeles
5151 State University Dr
Los Angeles, CA 90032-4226.

The award will be presented at the LITA President’s Program during the 2015 Annual Conference of the American Library Association in San Francisco.

About Emerald

Emerald is a global publisher linking research and practice to the benefit of society. The company manages a portfolio of more than 290 journals and over 2,350 books and book series volumes. It also provides an extensive range of value-added products, resources and services to support its customers’ needs. Emerald is a partner of the Committee on Publication Ethics (COPE) and works with Portico and the LOCKSS initiative for digital archive preservation. It also works in close collaboration with a number of organizations and associations worldwide.

About LITA

Established in 1966, LITA is the leading organization reaching out across types of libraries to provide education and services for a broad membership of almost 3,000 system librarians, library administrators, library schools, vendors and many others interested in leading edge technology and applications for librarians and information providers. For more information, visit, or contact the LITA office at 800-545-2433, ext. 4268; or e-mail:

For further information, contact Mary Taylor at LITA, 312-280-4267.

by Mark Beatty at September 17, 2014 07:43 PM

Andromeda Yelton

counting keynoter diversity in libraryland

Recently I mentioned to someone that the library speaker circuit is male-dominated, and she was surprised to hear it. It’s certainly a thing that feels overwhelming from the inside — I’ve been part of a 40% female speaker lineup in front of a 90% female audience — but maybe it’s not as much of a thing as I think?

Well. I counted speaker diversity at LITA Forum once; I can count keynote speakers at big library conferences too.

The takeaway: not as bad as I thought gender-wise but still pretty bad for a field that’s 80% female — except, oddly, library technology does better than the average. On the other hand, if you’re looking for non-white keynoters…it’s pretty bad.

In national-scale US/Canadian library conferences…

In national-scale US/Canadian library technology conferences…

A nice surprise

I honestly didn’t expect library tech to do better than the average, gender-wise. This is partly a function of tiny little sample size – only 14 keynoters. But it’s also a reminder that a few people can have a lot of leverage. A big part of what you’re seeing here is that code4lib decided to care: code4lib members went out of their way to nominate female keynoters, and keynoters who can speak to feminist issues, and in the open vote that ensued, the two winners were female. LibTechConf organizers went out of their way to solicit diverse speakers, too. And either of them alone tips the scale to majority female keynoters in libtech.

Thanks, code4lib and LibTechConf. You’re awesome.

Sumana Harihareswara

Sumana gave a killer talk at Code4Lib 2014. This made her one-third of all keynoters of Asian heritage in libraryland last year, and the only Indian-American. Yikes.


I was looking specifically at keynote speakers — the ones who get invited, paid, and put on a stage in front of the full audience. The ones we showcase as representatives of our values and interests; the ones we value most, metaphorically and literally. The ones we ask.

Not everyone uses the term “keynote”; I also counted “opening/closing general session”, “plenary”, and (in the case of ALA Midwinter, which lacks all of those things) “auditorium speaker series”.

I looked at the most recent iteration of the following conferences:

AALL, AASL, Access, ACRL, ALA Annual, ALA Midwinter, ALSC national institute, ASIS&T, code4lib, DLF, LibTechConf, LITA Forum, MLA, OLA Super Conference, OLITA Digital Odyssey, PLA, and SLA. (YALSA’s Symposium doesn’t seem to have keynoters.)

That’s pretty much what I thought of off the top of my head, biased toward libtech since that’s where I have the most awareness. Happy to add more and update accordingly!

Reminder: why I do this

This is what I ask: when you walk into a room, count. Count the women. Count the people of color. Count by race. Look for who isn’t there. Look for class signs: the crooked teeth of childhoods without braces, worn-out shoes, someone else who is counting. Look for the queers, the older people, the overweight. Note them, see them, see yourself looking, see yourself reacting.

This is how we begin.

– Quinn Norton, Count

by Andromeda at September 17, 2014 07:37 PM


Nominations Sought for Prestigious Kilgour Research Award

Nominations are invited for the 2015 Frederick G. Kilgour Award for Research in Library and Information Technology, sponsored by OCLC, Inc. and the Library and Information Technology Association (LITA), a division of the American Library Association (ALA). The deadline for nominations is December 31, 2014.

The Kilgour Research Award recognizes research relevant to the development of information technologies, in particular research showing promise of having a positive and substantive impact on any aspect of the publication, storage, retrieval and dissemination of information or how information and data are manipulated and managed. The Kilgour Award consists of $2,000 in cash, an award citation and an expense-paid trip (airfare and two nights’ lodging) to the ALA Annual Conference.

Nominations will be accepted from any member of the American Library Association. Nominating letters must address how the research is relevant to libraries; is creative in its design or methodology; builds on existing research or enhances potential for future exploration; and/or solves an important current problem in the delivery of information resources. A curriculum vita and a copy of several seminal publications by the nominee must be included. Preference will be given to completed research over work in progress. More information and a list of previous winners can be found at

Currently-serving officers and elected officials of LITA, members of the Kilgour Award Committee and OCLC employees and their immediate family members are ineligible.

Send nominations by December 31, 2014, to the Award jury chair:

Tao Zhang
Purdue University Libraries
504 W State St
West Lafayette, IN 47907-4221

The Kilgour Research Award will be presented at the LITA President’s Program on June 29th during the 2015 ALA Annual Conference in San Francisco.

About OCLC

Founded in 1967, OCLC is a nonprofit, membership, computer library service and research organization dedicated to the public purposes of furthering access to the world’s information and reducing library costs. More than 72,000 libraries in 170 countries have used OCLC services to locate, acquire, catalog, lend, preserve and manage library materials. Researchers, students, faculty, scholars, professional librarians and other information seekers use OCLC services to obtain bibliographic, abstract and full-text information when and where they need it. For more information, visit

About LITA

LITA is the leading organization reaching out across types of libraries to provide education and services for a broad membership including systems librarians, library administrators, library schools, vendors and many others interested in leading edge technology and applications for librarians and information providers. For more information, visit, or contact the LITA office by phone, 800-545-2433, ext. 4268; or e-mail:

For further information, contact Mary Taylor at LITA, 312-280-4267.

by Mark Beatty at September 17, 2014 07:17 PM

LITA/Ex Libris Seeking LIS Student Authors

The Library and Information Technology Association (LITA), a division of the American Library Association (ALA), is pleased to offer an award for the best unpublished manuscript submitted by a student or students enrolled in an ALA-accredited graduate program. Sponsored by LITA and Ex Libris, the award consists of $1,000, publication in LITA’s refereed journal, Information Technology and Libraries (ITAL), and a certificate. The deadline for submission of the manuscript is February 28, 2015.

The purpose of the award is to recognize superior student writing and to enhance the professional development of students. The manuscript can be written on any aspect of libraries and information technology. Examples include digital libraries, metadata, authorization and authentication, electronic journals and electronic publishing, telecommunications, distributed systems and networks, computer security, intellectual property rights, technical standards, desktop applications, online catalogs and bibliographic systems, universal access to technology, library consortia and others.

At the time the unpublished manuscript is submitted, the applicant must be enrolled in an ALA-accredited program in library and information studies at the master’s or PhD level.

To be eligible, applicants must follow the detailed guidelines and fill out the application form at:

Send the signed, completed forms by February 27, 2015 to the Award Committee Chair,

Sandra Barclay
Kennesaw State University
1200 Chastain Rd NW MD# 0009
Kennesaw, GA 30144-5827.

Submit the manuscript electronically to Sandra by February 28, 2015.

The award will be presented at the LITA President’s Program during the 2015 ALA Annual Conference in San Francisco.

About Ex Libris

Ex Libris is a leading provider of automation solutions for academic libraries. Offering the only comprehensive product suite for electronic, digital, and print materials, Ex Libris provides efficient, user-friendly products that serve the needs of libraries today and will facilitate their transition into the future. Ex Libris maintains an impressive customer base consisting of thousands of sites in more than 80 countries on six continents. For more information about Ex Libris Group visit

About LITA

Established in 1966, LITA is the leading organization reaching out across types of libraries to provide education and services for a broad membership including systems librarians, library administrators, library schools, vendors and many others interested in leading edge technology and applications for librarians and information providers. For more information, visit, or contact the LITA office by phone, 800-545-2433, ext. 4268; or e-mail:

For further information, please contact Mary Taylor at LITA, 312-280-4267.

by Mark Beatty at September 17, 2014 06:54 PM

Jobs in Information Technology: September 17

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Assistant University Archivist for Public Service, Princeton University/Mudd Manuscript Library, Princeton, NJ

Coordinator, Digital Library Service, Florida Virtual Campus, Gainesville, FL

Database and Metadata Management Coordinator, Fondren Library, Rice University, Houston, TX

Field Engineer Implementation Consultant, EnvisionWare, Inc., San Diego, CA

Implementation Consultant, EnvisionWare, Inc., Duluth, GA

Metadata Manager, SAGE Publications, Thousand Oaks, CA

Project Analyst, User Experience, SAGE Publications, Thousand Oaks, CA

Systems Librarian, Colgate University, Hamilton, NY

Sr. UNIX Systems Administrator, University Libraries, Virginia Tech, Blacksburg, VA

Technology Development Librarian, Wichita State University Libraries, Wichita, KS

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

by vedmonds at September 17, 2014 06:30 PM

Harvard Library Innovation Lab

Link roundup September 17, 2014

Looks like Matt’s been spamming the roundup.

Stack independent magazine subscription service

Innovation and the Bell Labs Miracle

Internet Archive – a short film about accessing knowledge

There’s Finally A Modern Typeface For Programmers

Libraries trusted to keep manuscripts hidden and safe

by Annie at September 17, 2014 05:01 PM

Library of Congress: The Signal

Welcoming the Newest Member of the Viewshare Team to the Library

The following is a guest post by Patrick Rourke, an Information Technology Specialist and the newest member of the Library’s Viewshare team.

I made my first forays into computing on days when it was too cold, wet or snowy to walk in the woods behind our house, in a room filled with novels, atlases and other books. Usually those first programming projects had something to do with books, or writing, or language – trying to generate sentences from word lists, or altering the glyphs the computer used for text to represent different alphabets.

After a traumatic high school exposure to the COBOL programming language (Edsger Dijkstra once wrote that “its teaching should be regarded as a criminal offense” (pdf)), in college I became fascinated with the study of classical Greek and Roman history and literature. I was particularly drawn to the surviving fragments of lost books from antiquity – works that were not preserved, but of which traces remain in small pieces of papyrus, in palimpsests, and through quotations in other works. I spent a lot of my free time in the computer room, using GML, BASIC and ftp on the university’s time sharing system.

My first job after graduation was on the staff of a classics journal, researching potential contributors, proofreading, checking references. At that time, online academic journals and electronic texts were being distributed via email and the now almost-forgotten medium of Gopher. It was an exciting time, as people experimented with ways to leverage these new tools to work with books, then images, then the whole panoply of cultural content.

This editorial experience led to a job in the technical publications department of a research company, and my interest in computing to a role as the company webmaster, and then as an IT specialist, working with applications, servers and networking. In my spare time, I stayed engaged with the humanities, doing testing, web design and social media engagement for the Suda On Line project, which publishes a collaborative translation and annotation of the 10th-century Byzantine lexicon in which many of those fragments of lost books are found.

My work on corporate intranets and my engagement with SOL motivated me to work harder on extending my programming skills, so before long I was developing web applications to visualize project management data and pursuing a master’s degree in computer science.  In the ten years I’ve been working as a developer, I’ve learned a lot about software development in multiple languages, frameworks and platforms, worked with some great teams and been inspired by great mentors.

I join the National Digital Information Infrastructure and Preservation Program as an Information Technology Specialist, uniting my interests in culture and computing. My primary project is Viewshare, a platform the Library makes available to cultural institutions for generating customized visualizations – including timelines, maps, and charts – of digital collections data. We will be rolling out a new version of Viewshare in the near future, and then I will be working with the NDIIPP team and the Viewshare user community on enhancing the platform by developing new features and new ways to view and share digital collections data. I’m looking forward to learning from and working with my new colleagues at the Library of Congress and everyone in the digital preservation community.

by Butch Lazorchak at September 17, 2014 03:49 PM


Using IntelliJ IDEA Ultimate as a Dev Environment for Islandora

In the past I have used NetBeans as my preferred environment for developing Islandora code; I also tried Eclipse and others periodically to see if they had any new must-have features. At Drupalcon Portland in 2013 I noticed many of the presenters were using PHPStorm and developers spoke highly of it, so I thought I should give it a try.

Most of the code for Islandora is PHP, but some of the open-source projects we rely on are written in Java or other languages, so instead of trying PhpStorm I downloaded a trial of IntelliJ IDEA Ultimate Edition, which has the functionality of PhpStorm (via a plugin) plus support for many other languages and frameworks.

My first impressions of IDEA Ultimate Edition were good. It was quick to load (compared to NetBeans) and the user interface was snappy, with no lag for code completion. I also really liked the Darcula theme, which is easy on the eyes. That first impression was enough to make me think it was worthwhile to spend a bit more time using it, and the more I used it, the more I liked it. I have been using IDEA as my main IDE for a year now.

IDEA has many plugins and supports many frameworks for various languages, so initial configuration can take some time, but once you have things configured it works well and runs smoothly. Islandora has strict coding standards, and IDEA is able to help with this: we can point it at the same PHP_CodeSniffer configuration that the Drupal Coder module uses. IDEA then highlights anything that does not conform to the configured coding standards, and it will fix many of the formatting errors if you choose to reformat the code. The PHP plugin also has support for Mess Detector, Composer, etc.

I also like the PHP debugger in IDEA. You can have several different configurations set up for various projects. While the debugger is a useful tool, I have run into situations where it opens a second copy of a file in the editor, which can cause issues if you don't notice.

You can also open an SSH session within IDEA, which is great for running git commands. The editor does have built-in support for Git, SVN and the like, but I prefer to use the command line for this, and in IntelliJ I can do so without leaving the IDE.

IDEA also has good support for editing XML files and for running and debugging transforms within the IDE.

Overall, IntelliJ IDEA Ultimate is definitely worth trying! It is a commercial product, so be prepared to buy a license when your trial ends. There is also a free Community Edition, but be sure to check whether it supports PHP. Most of the functionality discussed here is also available in PhpStorm, which is cheaper but doesn't support languages beyond PHP, HTML and the like. If you are part of an open-source project you can apply for an open-source license (Islandora has one); if you qualify, the license is free.

by ppound at September 17, 2014 02:48 PM

District Dispatch

It must be “FCC month” at the ALA

Well, yes, almost any month could be “FCC month” with the number of proceedings that affect libraries and our communities, but September has been particularly busy. Monday we entered the next round of E-rate activity with comments in response to the Federal Communications Commission’s Further Notice of Proposed Rulemaking (emphasis added), and closed out a record-setting public comment period in relation to promoting and protecting the Open Internet with two public filings.

I’ll leave it to Marijke to give the low-down on E-rate, but here’s a quick update on the network neutrality front:

ALA and the Association of College & Research Libraries (ACRL) filed “reply” comments with a host of library and higher education allies to further detail our initial filing in July. We also joined with the Center for Democracy & Technology (CDT) to re-affirm that the FCC has legal authority to advance the Open Internet through Title II reclassification or a strong public interest standard under Section 706. This work is particularly important as most network neutrality advocates agree the “commercially reasonable” standard originally proposed by the FCC does not adequately preserve the culture and tradition of the internet as an open platform for free speech, learning, research and innovation.

For better or worse, these filings are just the most recent milestones in our efforts to support libraries’ missions to ensure equitable access to online information. Today the FCC begins holding roundtables on network neutrality, which you can catch online. ALA and higher education network neutrality counsel John Windhausen has been invited to participate in a roundtable on October 7 to discuss the “Internet-reasonable” standard we have proposed as a stronger alternative to the FCC’s “commercially reasonable” standard.

The Senate will take up the issue in a hearing today, with testimony from CDT President and CEO Nuala O’Connor. And a library voice will again be included in a network neutrality forum—this time with Sacramento Public Library Director Rivkah Sass speaking at a forum convened by Congresswoman Doris Matsui on September 24. Vermont State Librarian Martha Reid testified at a Senate field hearing in July, and Multnomah County Library Director Vailey Oehlke discussed network neutrality with Senator Ron Wyden as part of an event in May.

This month ALA also filed comments in support of filings from the Schools, Health and Libraries Broadband (SHLB) Coalition, the State E-rate Coordinators Alliance (SECA) and NTCA—the Broadband Coalition, calling for eligible telecommunications carriers (ETCs) in the Connect America Fund to connect anchor institutions at higher speeds than those delivered to residents. Going further, ALA proposes that ETCs receiving CAF funding must serve each public library in their service territories at connection speeds of at least 50 Mbps download and 25 Mbps upload. Access and affordability are the top two barriers to increasing library broadband capacity, so both the Connect America Fund and the E-rate program are important components of increasing our ability to meet our public missions. AND we presented at the Telecommunications Policy Research Conference! Whew.

Buckle your seat belts and stay tuned, because “FCC Month” is only half over!

The post It must be “FCC month” at the ALA appeared first on District Dispatch.

by Larra Clark at September 17, 2014 02:26 PM


More than 148,000 items from the U.S. Government Printing Office now discoverable in DPLA

We were pleased to share yesterday that nearly 60,000 items from the Medical Heritage Library have made their way into DPLA, and we’re now doubly pleased to share that more than 148,000 items from the Government Printing Office’s (GPO) Catalog of U.S. Government Publications (CGP) are now also available via DPLA.

To view the Government Printing Office in DPLA, click here.

Notable examples of the types of records now available from the GPO include the Federal Budget, laws such as the Patient Protection and Affordable Care Act, Federal regulations, and Congressional hearings, reports, and documents. GPO continuously adds records to the CGP, which will also be available through DPLA, increasing the discoverability of and access to Federal Government information for the American public.

“GPO’s partnership with DPLA will further GPO’s mission of Keeping America Informed by increasing public access to a wealth of information products available from the Federal Government,” said Public Printer Davita Vance-Cooks. “We look forward to continuing this strong partnership as the collection of Government information accessible through DPLA continues to grow.”

GPO is the Federal Government’s official, digital, secure resource for producing, procuring, cataloging, indexing, authenticating, disseminating, and preserving the official information products of the U.S. Government. The GPO is responsible for the production and distribution of information products and services for all three branches of the Federal Government, including U.S. passports for the Department of State as well as the official publications of Congress, the White House, and other Federal agencies in digital and print formats. GPO provides for permanent public access to Federal Government information at no charge through our Federal Digital System, partnerships with approximately 1,200 libraries nationwide participating in the Federal Depository Library Program, and our secure online bookstore. For more information, please visit GPO’s website.

To read the full GPO press release announcing its partnership with DPLA, click here.

All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

by DPLA at September 17, 2014 01:50 PM

Andromeda Yelton

jQuery workshop teaching techniques, part 2: techniques geared at affective goals

I’m writing up what I learned from teaching a jQuery workshop this past month. I’ve already posted on my theoretical basis and pacing. Today, stuff I did to create a positive classroom climate and encourage people to leave the workshop motivated to learn more. (This is actually an area of relative weakness for me, teaching-wise, so I really welcome anyone’s suggestions on how to cultivate related skills!)

Post-it notes

I distributed a bunch of them and had students put them on their laptops when they needed help. This lets them summon TAs without breaking their own work process. I also had them write something that was working and something that wasn’t on post-its at the end of Day 1, so I could make a few course corrections for Day 2 (and make it clear to the students that I care about their feedback and their experience). I shamelessly stole both tactics from Software Carpentry.

Inclusion and emotion

The event was conducted under the DLF Code of Conduct, which I linked to at the start of the course material. I also provided Ada Initiative material as background. I talked specifically, at the outset, about how learning to code can be emotionally tough; it pushes the limits of our frustration tolerance and often (i.e. if we’re not young, white men) our identity – “am I the kind of person who programs? do people who program look like me?” And I said how all that stuff is okay. Were I to do it over again, I’d make sure to specifically name impostor syndrome and stereotype threat, but I’ve gotten mostly good feedback about the emotional and social climate of the course (whose students represented various types of diversity more than I often see in a programming course, if less than I’d like to see), and it felt like most people were generally involved.

Oh, and I subtly referenced various types of diversity in the book titles I used in programming examples, basically as a dog-whistle that I’ve heard of this stuff and it matters to me. (Julia Serano’s Whipping Girl, which I was reading at the time and which interrogated lots of stuff in my head in awesome ways, showed up in a bunch of examples, and a student struck up a conversation with me during a break about how awesome it is. Yay!)

As someone who’s privileged along just about every axis you can be, I’m clueless about a lot of this stuff, but I’m constantly trying to suck less at it, and it was important to me to make that both implicit and explicit in the course.

Tomorrow, how ruthless and granular backward design is super great.

by Andromeda at September 17, 2014 01:30 PM


Browser Developer Tools

Despite what the name may imply, browser developer tools are not only useful for developers. Anyone who works with the web (and if you are reading this blog, that probably means you) can find value in browser developer tools, because they use the browser, the tool we all use to access the riches of the web, to deconstruct the information that makes up the core of our online experience. A user who has a solid grasp of their browser’s developer tools can see lots of incredibly useful things.

The first step in understanding your browser’s developer tools is knowing that they exist. If you can only get to this step, you are far ahead of most people. Every browser has its own set of embedded developer tools, whether you are using Internet Explorer, Safari, Firefox, Chrome, or Opera. There’s no special developer version of the browser to install or any add-ons or extensions to download, and it doesn’t matter if you are on Windows, Mac or Linux. If a computer has a browser, it already has developer tools baked in.

The next step on the journey is learning how to use them. All browser developer tools are pretty similar, so skills gained in one browser translate well to others; unfortunately, the differences are still substantial enough to make a universal tutorial impossible. If you have a favorite browser, learn how to activate its various developer tools, what each one can do, how to use them effectively, and how to call each with its specific keyboard shortcut (learning to activate a specific tool with a keyboard shortcut is the key to making it part of your workflow). Once you have a solid understanding of the developer tools in your favorite browser, branch out and learn the developer tools of other browsers as well; after you have learned one set, learning the others is easy. By learning different sets of developer tools you will find that some are better at certain tasks than others. For instance (in my opinion), Firefox is best-in-class for CSS issues, but Chrome takes first place in JavaScript utilities.

Google search results in Firefox’s 3D view mode, which shows a web page’s nested elements as stacks of colored blocks. This is incredibly helpful for debugging CSS issues.

Another great reason to learn developer tools for different browsers has to do with the way browsers work. When most people think of web programming, they think of the server-side versions of files, because this is where the work is done. While server-side development is important, browsers are the real stars of the show. When a user requests a web page, the server sends back a tidy package of HTML, CSS and JavaScript that the browser must turn into a visual representation of that information. Think of it like a Lego kit: every kid buys the same kit from the store, with all the parts and instructions in a handy portable package, but it’s up to the individual to actually build something out of it, and the final product often varies slightly from person to person. Browsers are the same way: they all put the HTML, CSS and JavaScript together in a slightly different way to render a slightly different web page (which causes endless headaches for developers struggling to deliver a consistent user experience across browsers). Browser developer tools give us insight into both the code the browser receives and the way that individual browser puts the web page together. If a page looks a bit different in Internet Explorer than it does in Chrome, we can use each browser’s respective developer tools to peek into the rendering process and see what’s going on, in an effort to minimize those differences.

Now that you know browser developer tools exist and why they are so helpful, the only thing left is to learn them. Actually teaching you to use them is out of scope for this post, since it depends on which browser you use and what your needs are, but if you start playing around with them I promise you will find something useful almost immediately. If you are a web developer and you aren’t already using them, prepare for your life to get a lot easier. If you aren’t a developer but work with web pages extensively, prepare for your understanding of how a web page works to grow considerably (and, as a result, for your life to get a lot easier). I’m always surprised at how few people are aware that these tools even exist (or at what happens when someone stumbles upon them without knowing what they are), but someone with a solid grasp of browser developer tools can expose a problem with a single keyboard shortcut, even on someone else’s workstation. A person who can leverage these tools to figure out problems no one else can often acquires the mystical aura of an internet wizard with secret magic powers among their merely mortal coworkers. Become that person with browser developer tools.
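As a small illustration of the kind of quick console check this makes possible, here is a sketch. The helper name and the broken-image scenario are my own invention, not from any particular browser's documentation:

```javascript
// Paste into the browser console (typically F12, or Cmd+Opt+I on a Mac)
// to list images on a page that failed to load. `brokenImages` is a
// hypothetical helper: it accepts any array of image-like objects that
// expose `complete` and `naturalWidth`, as real <img> elements do.
function brokenImages(images) {
  // An image that never finished loading, or loaded with zero width,
  // is almost certainly broken.
  return images.filter(img => !img.complete || img.naturalWidth === 0);
}

// In a live console session you would call it on the current page:
// brokenImages(Array.from(document.images));
```

On an actual page this is a one-liner check that works on anyone's workstation, which is exactly the "internet wizard" trick described above.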

by Bryan Brown at September 17, 2014 01:00 PM

Ed Summers

Google’s Subconscious

Can a poem provide insight into the inner workings of a complex algorithm? If Google Search had a subconscious, what would it look like? If Google mumbled in its sleep, what would it say?

A few days ago, I ran across these two quotes within hours of each other:

So if algorithms like autocomplete can defame people or businesses, our next logical question might be to ask how to hold those algorithms accountable for their actions.

Algorithmic Defamation: The Case of the Shameless Autocomplete by Nick Diakopoulos


A beautiful poem should re-write itself one-half word at a time, in pre-determined intervals.

Seven Controlled Vocabularies by Tan Lin.

Then I got to thinking about what a poem auto-generated from Google’s autosuggest might look like. OK, the idea is of dubious value, but it turned out to be pretty easy to do in just HTML and JavaScript (low computational overhead), and I quickly pushed it up to GitHub.

Here’s the heuristic:

  1. Pick a title for your poem, which also serves as a seed.
  2. Look up the seed in Google’s lightly documented suggestion API.
  3. Get the longest suggestion (text length).
  4. Output the suggestion as a line in the poem.
  5. Stop if more than n lines have been written.
  6. Pick a random substring in the suggestion as the seed for the next line.
  7. GOTO 2

The initial results were light on verbs, so I found a list of verbs and occasionally added one at random to the suggested text. The poem is generated in your browser using JavaScript, so hack on it and send me a pull request.
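The heuristic above can be sketched in a few lines of JavaScript. Here `suggest` stands in for the call to Google's (lightly documented) suggestion API, and every name below is my own; treat this as a sketch of the steps, not the code in the actual repository:

```javascript
// Steps 1-7 of the heuristic. `suggest` is injected so the sketch stays
// self-contained: it takes a query string and returns an array of
// suggestion strings (in the real thing, from Google's autosuggest).
function generatePoem(seed, suggest, maxLines = 5) {
  const lines = [];
  let query = seed; // Step 1: the title doubles as the first seed.
  while (lines.length < maxLines) { // Step 5: stop after n lines.
    const suggestions = suggest(query); // Step 2: look up the seed.
    if (suggestions.length === 0) break;
    // Step 3: keep the longest suggestion by text length.
    const line = suggestions.reduce((a, b) => (b.length > a.length ? b : a));
    lines.push(line); // Step 4: the suggestion becomes a line of the poem.
    // Step 6: a random substring of the line seeds the next lookup.
    const start = Math.floor(Math.random() * line.length);
    query = line.slice(start, start + 10) || seed;
  } // Step 7: GOTO 2.
  return lines.join('\n');
}
```

Wiring in the real autosuggest endpoint (and the occasional random verb) is left as it is in the original: a browser-side fetch with no server component.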

Assuming that Google’s suggestions are personalized for you (if you are logged into Google) and your location (your IP address), the poem is dependent on you. So I suppose it’s more of a collective subconscious in a way.

If you find an amusing phrase, please hover over the stanza and tweet it — I’d love to see it!

by ed at September 17, 2014 11:50 AM