bioCADDIE 2018 workshop

 

9:00 am - Data Discovery Index (DDI) and the ecosystem for biomedical data sharing

Lucila Ohno-Machado, University of California San Diego

In this session, we will contextualize the work of the bioCADDIE consortium in light of various initiatives to enable fair use of data by a large community of users. We will also highlight the many parts needed for a complex engine that performs data ingestion and “digestion” (i.e., transformation of metadata into a common format), accepts free-text queries, and returns a ranked list of datasets. We will discuss the involvement of a large community of collaborators and several initiatives that were designed to promote that community's engagement. We will then present each component of the DDI in light of related work: (A) Obtaining metadata from repositories; (B) Developing, implementing, and refining a minimal metadata standard; (C) Building a Data Search Engine. Finally, we will discuss the implications of the proof-of-concept DDI we developed, the social barriers to adoption of data sharing environments, and the lessons learned during 3.5 years of activities.

9:30 am - Data Discovery Index (DDI) and the ecosystem for biomedical data sharing

Discussion Time

9:45 am - Gathering metadata on the Web

Jeff Grethe, University of California San Diego

  • Choice of repositories
  • Data ingestion
    • Metadata typically provided by repositories
    • Unique identifiers and duplicated data sets within and across repositories
    • Updates in datasets and the use of crawlers for detecting changes
  • Challenges and opportunities
    • Which approaches worked best
  • Recommendations moving forward

10:15 am - bioCADDIE, CEDAR and Commons Credit Pilot

Jeff Grethe, University of California San Diego

Discussion and demo of the joint efforts between bioCADDIE, CEDAR and the Commons Credit Pilot teams.
 

10:45 am
BREAK
 

11:00 am - DATS: minimal metadata model for datasets (Part 1)

Philippe Rocca-Serra, Oxford University

  • Origin of the Data Tag Suite (DATS). Overview of related models.
  • Core and optional DATS elements
  • Data types and ontologies
    • Subject Resource Annotation Ontology
    • Domain Resource Annotation Ontology
  • Challenges and opportunities for continuous improvement of DATS; next steps

11:30 am - DATS: minimal metadata model for datasets (Part 2)

George Alter, University of Michigan; Hua Xu, University of Texas Health Sciences

 

12:15 pm - 1:05 pm
LUNCH BREAK


1:05 pm - Panel on DATS

Facilitator: George Alter
Panel Members: Susanna Sansone, Oxford University; Matt McAuliffe, NIH; Kevin Read, NYU Health Sciences Library

1:45 pm - DATS: minimal metadata model for datasets

Discussion Time

2:00 pm - DataMed: a data search engine

Hua Xu, University of Texas Health Sciences

  • Query expansion from free text
  • Search, filters
  • Links to the literature, data citations
  • Ranking algorithms and strategies employed by ranking contestants
  • User interfaces and usability testing

2:30 pm - DataMed Demo

Hua Xu, University of Texas Health Sciences


3:00 pm
BREAK
 

3:15 pm - Lessons learned and future directions

Lucila Ohno-Machado, University of California San Diego

Throughout the meeting, we will engage the audience in discussing the social aspects, organization, and future of data discovery activities. In the final session, we will summarize these discussions and discuss the final publication.


4:00 pm
END