Programmatic Compilation of Chemical Data and Literature from PubChem using MATLAB
DOI: https://doi.org/10.18260/2-1-370.660-115508
Abstract
MATLAB live scripts are useful for the reproducible, programmatic compilation of chemical data and literature. In this article, we combine the PubChem PUG REST Application Programming Interface (API), the Structured Data Query (SDQ) agent, and text extraction in MATLAB live scripts that support programmatic PubChem similarity searching, SMARTS substructure queries, literature searching, compilation of compound-based bibliometric data, and SDfile data extraction. All MATLAB live scripts are openly available and can be adapted with minimal changes to the code. We discuss how these live scripts can increase scientific reproducibility and be integrated into chemistry and chemical engineering education.
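
As a rough illustration of the kind of request involved, a PUG REST 2D similarity search might be sketched in MATLAB as follows. This is a minimal sketch and not necessarily the code used in the article's live scripts: the query compound (aspirin), the 95% threshold, and the record limit are assumed values chosen only for demonstration, and the script needs network access to PubChem.

```matlab
% Minimal sketch of a PubChem PUG REST 2D similarity search.
% The query SMILES, threshold, and record limit are illustrative assumptions.
api        = 'https://pubchem.ncbi.nlm.nih.gov/rest/pug';  % PUG REST base URL
smiles     = 'CC(=O)OC1=CC=CC=C1C(=O)O';                   % aspirin (example query)
threshold  = 95;                                           % Tanimoto threshold, percent
maxRecords = 10;                                           % cap on returned CIDs

% Percent-encode the SMILES so characters such as '=' and '(' are URL-safe.
url = [api '/compound/fastsimilarity_2d/smiles/' urlencode(smiles) '/cids/JSON' ...
       sprintf('?Threshold=%d&MaxRecords=%d', threshold, maxRecords)];

% webread decodes the JSON response into a MATLAB struct.
opts   = weboptions('Timeout', 30);
result = webread(url, opts);

% PubChem returns the matching compound identifiers under IdentifierList.CID.
cids = result.IdentifierList.CID;
disp(cids)
```

Changing `smiles`, `threshold`, or `maxRecords` is enough to rerun the search for a different compound, which is the sort of minimal modification the live scripts are designed around.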