Data Curation Expert

There are many different career possibilities, other than librarian, for those who hold a Master of Library Science (MLS) or Master of Library and Information Science (MLIS) degree. One of these career options is data curation expert. Data services are growing in libraries, particularly academic libraries, with the increasing amount of data available. Several MLS and MLIS programs now offer specializations in Data Curation, providing educated and trained workers to fill the need for data support services in the workplace. Data curation is a growing area of employment need in the private and public sectors. 

What is data curation? For data to be searched, retrieved, and analyzed, specialized curatorial actions must be taken. These actions prepare the data for reuse and discovery, and include transforming files into archival formats and selecting a license or copyright statement. The position is an important one that combines library knowledge with data knowledge. If you want to learn more about becoming a data curation expert, read on.

What Do Data Curation Experts Do?

Data curators collect data from many sources and manage that data to make it more useful for users. Data curation focuses on the maintenance and management of metadata, rather than of the database itself. Curation decisions often draw on how heavily particular articles or services are used. Data curators manage and maintain data but are also involved in determining best practices for working with data. The role bridges the worlds of library science, information technology, and data science. The job of the data curator is highly important, as data must be catalogued and curated correctly before it can be used.

Data curators work with metadata, which is essentially descriptive data offering information about other data. Metadata is a small amount of information used in a cataloging system to provide basic information about the data, making that data easier to find and track. Data curation experts use data dictionaries to create metadata repositories. Data dictionaries usually include attribute names (specifications defining a feature of an object), whether each attribute is optional or required before a record can be saved, and attribute types (the kind of data allowed in a field).
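
As a rough illustration of the data dictionary described above, the Python sketch below models each entry with an attribute name, a required flag, and an allowed type, then checks a record against it before the record would be saved. The field names (title, creator, year, license) and the helper validate_record are hypothetical and chosen only for illustration; real repositories define their own schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """One entry in a data dictionary."""
    name: str       # attribute name
    type_: type     # type of data allowed in the field
    required: bool  # must be present before a record can be saved


# Hypothetical data dictionary for a small collection of datasets.
DATA_DICTIONARY = [
    Attribute("title", str, required=True),
    Attribute("creator", str, required=True),
    Attribute("year", int, required=True),
    Attribute("license", str, required=False),
]


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record can be saved."""
    problems = []
    known = {attr.name for attr in DATA_DICTIONARY}
    for attr in DATA_DICTIONARY:
        if attr.name not in record:
            if attr.required:
                problems.append(f"missing required attribute: {attr.name}")
            continue
        if not isinstance(record[attr.name], attr.type_):
            problems.append(
                f"{attr.name} should be {attr.type_.__name__}, "
                f"got {type(record[attr.name]).__name__}"
            )
    # Flag fields the dictionary does not define.
    for key in record:
        if key not in known:
            problems.append(f"unknown attribute: {key}")
    return problems


good = {"title": "River gauge readings", "creator": "A. Curator", "year": 2021}
bad = {"title": "Untitled", "year": "2021", "format": "csv"}

print(validate_record(good))
print(validate_record(bad))
```

The first record passes with no problems reported, while the second is flagged for a missing creator, a year stored as text rather than a number, and a field the dictionary does not define.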

Data catalogs are a way to organize metadata, acting as both a search engine and a collaborative platform, similar to a wiki, where users can create content together. Data catalogs index data systems. Data curators take the organization of metadata a step further by working with both data dictionaries and data catalogs. Curators must fully understand the systems that store the data, as well as the tools used for processing the data.
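
To make the "search engine" side of a data catalog more concrete, here is a minimal Python sketch that indexes a few metadata records by the words in their titles and keywords, then looks up datasets by search term. The records, identifiers, and the build_index and search helpers are all hypothetical; production catalogs are far more elaborate.

```python
from collections import defaultdict

# Hypothetical metadata records describing datasets held in different systems.
RECORDS = [
    {"id": "ds-001", "title": "River gauge readings", "system": "warehouse",
     "keywords": ["hydrology", "sensors"]},
    {"id": "ds-002", "title": "Patient intake survey", "system": "file share",
     "keywords": ["health", "survey"]},
    {"id": "ds-003", "title": "Rainfall sensors archive", "system": "warehouse",
     "keywords": ["hydrology", "climate"]},
]


def build_index(records):
    """Map each lowercase term (from titles and keywords) to dataset ids."""
    index = defaultdict(set)
    for rec in records:
        terms = rec["title"].lower().split() + [k.lower() for k in rec["keywords"]]
        for term in terms:
            index[term].add(rec["id"])
    return index


def search(index, query):
    """Return sorted ids of datasets matching every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


index = build_index(RECORDS)
print(search(index, "hydrology"))
print(search(index, "sensors climate"))
```

Searching for "hydrology" returns both water-related datasets, while "sensors climate" narrows the result to the one record that matches both terms.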

Data curators organize the data so that the IT department staff, data engineers and data scientists can all work with the data. 

What Tools Do Data Curation Experts Use?

Data curation experts use a variety of tools in performing their work, such as:

  • Digital curation resources – a catalog of tools for data creators and digital curators
  • DCC tools – a collection of curation and data management tools
  • OpenRefine – a free, open-source tool used for working with complicated data and transforming formats, extending that data to the internet, and linking it to other databases
  • The DMPTool – a free, open-source online application for creating data management plans
  • Qualitative Data Repository – curates, preserves, publishes and promotes the download of digital data in the social sciences
  • re3data.org – a registry of some 2,000 research data repositories, used to find places to access and share data

Where Do Data Curation Experts Work?

Data curation experts may work for a business, hospital or other organization, organizing its massive amounts of data. They manage data as objects and provide an option for the storage of unstructured data. Examples of places where data curation experts may be employed include corporations, academic facilities, pharmaceutical companies, biomedical technology companies, technology companies, government agencies, and more.

Education for Data Curation Experts

Becoming a data curation expert begins with earning an MLIS or MLS degree. Ideally, you should choose a program that offers data curation as a specialization. Only a few universities currently offer such programs, including:

  • University of Illinois – Urbana-Champaign
  • University of Maryland
  • Rutgers University
  • University of North Carolina-Chapel Hill 
  • University of Arizona
  • University of Michigan
  • Simmons University (formerly Simmons College)

Specialized coursework taken within an MLIS or MLS program with data curation as a concentration may include subjects like:

  • Metadata theory
  • Digital preservation
  • Systems analysis and management
  • Digital libraries
  • Information modeling
  • Ontology development
  • Information organization
  • Information storage and retrieval
  • Data management
  • Information policy 

Most data curation professionals working in the field say that their MLIS or MLS education gave them a solid foundation, but that many of the skills they needed to perform as data curation experts were then learned on the job. 

Certification for Data Curation Experts

The SAS Academy for Data Science offers certification for data curation experts. The Data Curation Professional (Dc) credential requires completion of four online training courses. These courses are:

  • Introduction to Data Curation for SAS Data Scientists
  • SAS Data Management Tools and Applications
  • SAS and Hadoop
  • Additional SAS Data Management Tools and Applications

SAS Data Curation Professional certification is not necessary in order to become a data curation expert, but some employers may value the credential. 

What Does Data Curation Do for the Data Industry?

Without data curators, much data would remain jumbled and unusable. Data curation enhances the long-term value of data, making it available for higher quality research. It helps to make machine learning more effective, as it allows humans to add knowledge to what a machine has automated. It helps deal with large, disorganized data swamps, giving that data value. It educates users by providing curated information for them to access. It also helps ensure that data is of high quality and supports its long-term preservation and retention. Data curation is essential to speeding up innovation in organizations by opening up data and sharing knowledge of how it is used.

Jobs in Data Curation

A recent search of the Internet found data curation expert jobs in various locations and settings throughout the United States. Some of them include:

  • Data Curator – Sage Bionetworks, Seattle, WA – starting salary $74,000 to $114,000 annually
  • Data Curator – University of Pennsylvania, Philadelphia, PA – starting salary $50,684 to $138,391 annually
  • Scientific Data Curator – Axle Informatics, Rockville, MD – starting salary $80,000 to $90,000 annually
  • Data Curation Specialist – University of Illinois at Urbana-Champaign – starting salary $63,572 annually