Public Release of Metadata Extraction and Conversion Tools for Characterization Measurement Data with Machine-readability via M-DaC

— For Accelerating Materials Data Sciences—

2019.01.30


National Institute for Materials Science

Abstract

  1. The MaDIS Division’s Materials Data Platform Center, NIMS, teamed up with two materials characterization instrument manufacturers (ULVAC-PHI, Inc. and Rigaku Corporation) to develop tools to extract a set of metadata as XML files. These tools are expected to enable efficient generation and accumulation of AI- and machine learning-ready data and promote the use of data science in materials development.
  2. Data-driven materials development, which applies machine learning to the statistical processing of materials data in order to search for and develop new materials, is attracting a great deal of interest. However, the measurement records to be processed are often stored in different formats, even when collected from instruments made by the same manufacturer, which makes data comparison difficult. In addition, because data files often lack metadata, such as some of the measurement conditions, it can be difficult to use them for machine learning. Demand has therefore been high for tools capable of converting raw data into a machine-readable format.
  3. In collaboration with characterization instrument manufacturers, we developed tools that enable computers to read metadata based on specific terminology conversion rules and extract primary parameters. These tools were developed for two measurement techniques commonly used in the evaluation of materials: X-ray photoelectron spectroscopy (XPS) and X-ray diffraction (XRD). The first version of these tools is capable of processing XPS spectral and depth profiling data generated by ULVAC-PHI’s Quantera SXM and other instruments, and powder XRD pattern data generated by Rigaku’s SmartLab diffractometers. We intend to extend the tools to a wider range of raw data formats, including data generated by other characterization instrument manufacturers, by instruments for other techniques, and in other fields of science and engineering.
  4. In addition to the metadata extraction tools described above, we also developed binary-to-text conversion tools and visualization tools with a matrix parser program capable of converting spectral data, etc. into graphs and images. These tools were combined and named “M-DaC” (materials data conversion tools), which will be released through the NIMS-DPFC website. Users are permitted to modify some of the program’s source code under the MIT License. Samples of raw data generated by the supported instruments will also be made available for public use under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
    M-DaC is accessible at https://www.nims.go.jp/MaDIS/about/M-DaC.html
  5. We gave presentations on M-DaC at “nano tech 2019” and at the MaDIS symposium: Accelerating materials development using AI and data platform strategies, both of which were held at Tokyo Big Sight starting on January 30, 2019.
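As a rough illustration of the metadata extraction described in item 3, the following minimal Python sketch maps instrument-specific header terms to common terms and writes them out as XML. It is not taken from M-DaC: the raw header layout ("key: value" lines), the rule table TERM_RULES, the function name extract_metadata and the XML element names are all hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical terminology conversion rules: instrument-specific key -> common term.
# The real M-DaC rule sets are not described in this release.
TERM_RULES = {
    "XraySource": "x_ray_source",
    "PassEnergy": "pass_energy",
    "AcqDate": "measurement_date",
}


def extract_metadata(raw_header: str) -> ET.Element:
    """Convert 'key: value' header lines into an XML <metadata> element."""
    root = ET.Element("metadata")
    for line in raw_header.splitlines():
        # Split on the first colon only, so values may themselves contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or key not in TERM_RULES:
            continue  # skip lines that have no conversion rule
        child = ET.SubElement(root, TERM_RULES[key])
        child.text = value.strip()
    return root


# Made-up header text, for demonstration only.
sample = "XraySource: Al Ka\nPassEnergy: 55 eV\nOperator: n/a\n"
print(ET.tostring(extract_metadata(sample), encoding="unicode"))
```

In this sketch, header entries without a conversion rule (such as the "Operator" line) are simply dropped, so only the primary parameters named in the rule table reach the XML output.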

Contacts

(Regarding this research)

Hideki Yoshikawa
Deputy Managing Director
Materials Data Platform Center
Research and Services Division of Materials Data and Integrated System (MaDIS)
National Institute for Materials Science
TEL: +81-29-859-2451
E-Mail: YOSHIKAWA.Hideki=nims.go.jp
(Please change "=" to "@")

(For general inquiries)

Public Relations Office
National Institute for Materials Science
Tel: +81-29-859-2026
Fax: +81-29-859-2017
E-Mail: pressrelease=ml.nims.go.jp
(Please change "=" to "@")