Code4Lib 2015

Catalog Pull Platform & Projects

Pull vs. Push Demand

Instead of trying to perfectly predict the demands of patrons, staff, and institutional administration for information resources and services, the Catalog Pull Platform focuses on building fast, loosely coupled, networked web tools that are iteratively developed using processes and techniques from Agile and the Lean Startup's Build-Measure-Learn cycles (Ries, 2011).

In the Catalog Pull Platform, sophisticated tools emerge over time, starting with a Minimum Viable Product that is then incrementally improved through direct testing and feedback from the participants in the pull platform.

Critical to pull platforms is that new ideas and experiments are encouraged, with negligible barriers to use by participants, and that these new ideas and processes are easily communicated to other participants in the platform.

In the fall of 2014, Jeremy Nelson and Aaron Schmidt of Influx Library User Experience successfully bid on a Library of Congress contract for a BIBFRAME Search and Display System. This new system - called BIBFRAME Catalog for short - is built from loosely coupled components pulled from the Catalog Pull Platform. The BIBFRAME Catalog can be run as a standalone Flask application or, more likely, used as a Flask Blueprint in an institution's own catalog or website.

BIBFRAME Catalog & Datastore

Library of Congress

BIBFRAME is the Library of Congress's multi-year effort to replace the venerable MARC21 format with a Linked Data-based vocabulary made up of four fundamental classes: Works, Instances, Annotations, Authorities.

Search Display System RFQ

In August 2014, the Library of Congress issued an RFQ for

...a working demonstration of a BIBFRAME search and display tool for community stakeholders and to make the resulting code available for free download so that developers can experiment with and build on the tool.

Influx Library User Interface Design

Influx Library User Experience is a full service design and development firm committed to helping make great libraries. Aaron Schmidt of Influx has presented internationally on the topics of library user experience and website usability. He currently is a lecturer at the San Jose State University School of Library and Information Science, and is a columnist for Library Journal.


The BIBFRAME Catalog project is starting its second iteration of the Build-Measure-Learn loop.


Flask Microframework

The open-source Flask web microframework is the middleware that connects the rich HTML5/JavaScript web catalog front-end with the BIBFRAME Datastore, which stores and manages bibliographic metadata through the Semantic Server REST API.

BIBFRAME Datastore

The BIBFRAME Datastore provides a REST API, file-system storage, configuration, and other utilities for managing Linked Data entities in the BIBFRAME Catalog. The REST API extends the Semantic Server's REST APIs for analytics, CRUD, and machine learning operations on the underlying Fedora 4, Elastic Search, and Fuseki services.

The Semantic Server is an open-source REST API wrapper that provides a single interface to RDF entities stored as subject graphs in Fedora 4, with expanded and enriched search through Elastic Search and a full SPARQL endpoint through Fuseki.

Fedora Commons

The core of the Semantic Server is the Fedora Commons version 4 repository. Fedora's primary use in the Semantic Server is as a Linked Data Platform. Linked Data is stored and managed as subject RDF graphs in Fedora 4, with most of the work done using BIBFRAME and vocabularies.
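As a rough illustration of Fedora 4 acting as a Linked Data Platform, the sketch below prepares (but does not send) an HTTP POST that would create one subject graph from a Turtle body. The container URL, the `Slug` value, the `http://bibframe.org/vocab/` namespace, and the helper name `build_create_request` are assumptions made for this example, not details taken from the Semantic Server.

```python
from urllib.request import Request

# Turtle for a single subject graph. "<>" refers to the resource being created.
# The bf: namespace and the bf:title property are assumed for illustration.
TURTLE = """\
@prefix bf: <http://bibframe.org/vocab/> .

<> a bf:Work ;
   bf:title "Pride and Prejudice" .
"""


def build_create_request(container_url, turtle, slug=None):
    """Prepare an LDP-style POST that asks the server to create a new RDF resource."""
    headers = {"Content-Type": "text/turtle"}
    if slug:
        # Slug is an optional hint for the name of the new resource.
        headers["Slug"] = slug
    return Request(
        container_url,
        data=turtle.encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical local Fedora 4 REST endpoint; nothing is sent over the network.
create_request = build_create_request(
    "http://localhost:8080/rest/", TURTLE, slug="work-1"
)
```

Passing `create_request` to `urllib.request.urlopen` would perform the call against a running repository.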

Elastic Search & Fuseki

Elastic Search is used for real-time search by indexing the JSON-LD serialization of the RDF subject graphs stored in Fedora. All BIBFRAME resources, including Works, Authorities, Annotations, and Instances are indexed into Elastic Search using their JSON-LD serialization. The BIBFRAME RDF graphs are retrieved from the Fedora repository giving the Semantic Server the ability to leverage the rich Elastic Search ecosystem.
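A minimal sketch of what a JSON-LD serialization of one subject graph could look like before indexing. The `to_jsonld` helper, the example URI, and the `http://bibframe.org/vocab/` namespace are assumptions for illustration; the actual document shape used by the Semantic Server is not described here, and the indexing request itself is left out.

```python
import json

BF = "http://bibframe.org/vocab/"  # assumed BIBFRAME namespace


def to_jsonld(subject_uri, bf_class, properties):
    """Build a small JSON-LD document for one subject graph."""
    doc = {
        "@context": {"bf": BF},
        "@id": subject_uri,
        "@type": "bf:" + bf_class,
    }
    for name, value in properties.items():
        # Each property becomes a compact IRI such as "bf:title".
        doc["bf:" + name] = value
    return doc


# Hypothetical Work resource; the URI is a placeholder.
work_doc = to_jsonld(
    "http://example.org/rest/work-1",
    "Work",
    {"title": "Pride and Prejudice"},
)

# The JSON string is what would be submitted as the document body to a search index.
index_body = json.dumps(work_doc, sort_keys=True)
```

Because the document is plain JSON, a search engine can index fields such as `bf:title` without any RDF-specific tooling.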

Apache Fuseki, a full-featured SPARQL endpoint, enhances the functionality and utility of Fedora 4 as a Linked Data platform for such uses in the BIBFRAME Catalog as resource de-duplication. Fuseki and Fedora 4 also allow for inference querying and enhance the possibilities for sharing linked-data resources with external sources like the Library of Congress, OCLC, OpenLibrary, and the Libhub initiative for shared BIBFRAME resources.
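One way de-duplication against a SPARQL endpoint might be approached is to ask for titles shared by more than one Work. The sketch below only builds the request, following the SPARQL 1.1 Protocol convention of a form-encoded `query` parameter. The endpoint URL, the dataset name `bibframe`, the namespace, and the choice of matching on `bf:title` are all assumptions for this example, not the BIBFRAME Catalog's actual matching rules.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Find titles that are attached to more than one bf:Work (candidate duplicates).
DUPLICATE_TITLES_QUERY = """\
PREFIX bf: <http://bibframe.org/vocab/>
SELECT ?title (COUNT(?work) AS ?copies)
WHERE {
  ?work a bf:Work ;
        bf:title ?title .
}
GROUP BY ?title
HAVING (COUNT(?work) > 1)
"""


def build_sparql_request(endpoint, query):
    """Prepare a SPARQL query as an HTTP POST with a form-encoded body."""
    data = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Accept": "application/sparql-results+json",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Supplying a body makes urllib issue a POST.
    return Request(endpoint, data=data, headers=headers)


# Hypothetical local Fuseki query service; nothing is sent over the network.
dedup_request = build_sparql_request(
    "http://localhost:3030/bibframe/query", DUPLICATE_TITLES_QUERY
)
```

Each row returned by such a query would be a title plus the number of Works carrying it, which a cataloger or a merge script could then review.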



Falcon is an open-source Python framework for building Cloud APIs following a REST architectural style. Falcon is sponsored and used by Rackspace. The Semantic Server uses Falcon to build a REST API that gives a consistent and simple interface to Fedora, Elastic Search, and Fuseki.


Next steps

After the standalone version of the BIBFRAME Catalog and Datastore is complete, the Semantic Server will expand to include more analytic and cache support for tracking, analyzing, and visualizing the usage of BIBFRAME and other resources.

Using the Semantic Server

Colorado College's TIGER Catalog

As an ongoing experiment for replacing Colorado College's legacy ILS, the first TIGER Catalog Minimum Viable Product was released in 2014 using Aaron Schmidt's design outlined in a blog posting. This first iteration uses Flask and Solr, with all of Colorado College's MARC21 records serialized as JSON documents in MongoDB.

Current TIGER development is pivoting to the Semantic Server to improve the operations and analytics for assessment, discoverability, and accessibility of Colorado College Library's collections, which are currently stored as MARC records, Fedora 3 digital repositories, and a large number of journal databases and resources.

Islandora eBadges

A Flask application being developed for the foundation to issue Mozilla Open Badges for camps, special projects, and other types of educational and training events. Badge information and individually issued badge images are stored in an embedded semantic server using and OpenBadges linked data.


What is a Pull Platform?

John Hagel III's and John Seely Brown's initial conception of a pull platform comes from a 2008 article that they later expanded in their 2010 book, The Power of Pull. The fundamental idea of a pull platform is that, instead of extensive prediction and planning, the resources and services in the pull platform are directly "pulled" by the needs of users and active participants. For the Catalog Pull Platform, staff, librarians, library administration, commercial entities, and algorithms are these sources of pull.

What is Linked Data?

Linked Data is about making relationships between our data more explicit in machine-readable formats. A Linked Data statement, called an RDF triple, usually takes the form subject, predicate, object.

What we think of as "bibliographic records" are, in a linked-data context, just collections of statements that follow this basic RDF triple model. A subject can be a URI or a blank node, a predicate is always a URI, and an object can be a URI, a blank node, or a literal such as a string. A collection of these triples makes up an RDF graph.
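The triple model can be sketched with nothing more than tuples and a set. This is a toy representation for illustration only: the example URIs, the `http://bibframe.org/vocab/` namespace, the property names, and the `describe` helper are assumptions, and a real application would normally use an RDF library that distinguishes URIs, blank nodes, and literals by type rather than by plain strings.

```python
from collections import namedtuple

# One RDF statement: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

BF = "http://bibframe.org/vocab/"  # assumed BIBFRAME namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
WORK = "http://example.org/work/1"  # hypothetical subject URI
AUTHOR = "_:b0"  # blank node label, written as in Turtle

# An RDF graph is a set of triples, so repeated statements collapse into one.
graph = {
    Triple(WORK, RDF_TYPE, BF + "Work"),
    Triple(WORK, BF + "title", "Pride and Prejudice"),  # object is a literal
    Triple(WORK, BF + "creator", AUTHOR),  # object is a blank node
    Triple(AUTHOR, BF + "label", "Austen, Jane"),
    Triple(WORK, RDF_TYPE, BF + "Work"),  # duplicate statement, stored once
}


def describe(graph, subject):
    """Collect every predicate and object stated about one subject."""
    description = {}
    for triple in sorted(graph):
        if triple.subject == subject:
            description.setdefault(triple.predicate, []).append(triple.object)
    return description


work_record = describe(graph, WORK)
```

Here `work_record` plays the role of a "bibliographic record": it is simply every statement in the graph whose subject is the Work.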

Do you have a shameless book plug?

Yes, Jeremy Nelson and Chandos Publishing are in post-production on a book, Becoming a Lean Library: Lessons from the World of Start-ups, that goes into much more detail on the Catalog Pull Platform, Build-Measure-Learn loops, and other lean manufacturing and lean start-up ideas and concepts as applied to libraries and other cultural heritage organizations.

This presentation was developed with Flask and with a modified design from Zerotheme. All original content and graphics in this lightning talk are licensed under CC BY 4.0; all source code is licensed under the GPLv4, with the repository hosted on GitHub at

All logos, trademarks, and any associated media are owned by their respective organizations and individuals. Any use in this lightning talk is assumed to be fair use and not for a commercial purpose.
