Abstract

The notion that all scientific software should be open-source and free has been actively promoted in recent years, mostly from the top down via mandates from funding agencies [1] but occasionally from the bottom up, as exemplified by a recent Viewpoint in this journal [2]. A commonly articulated rationale is that the results of scientific research funded by government grants should be free for society and that the scientific community benefits from free access. The purpose of this Viewpoint is to examine the consequences of these opinions.

What Is Scientific Software?

Modern computational chemistry software is an extremely complex product based on advanced scientific ideas (models and theories) and sophisticated algorithms that transform these ideas from equations into useful tools. The development of practical software that can be used by nonexperts to solve contemporary research problems requires considerable technical effort to produce and maintain robust, efficient, and validated code. Unlike the development of, for example, a smart-phone app, where the code base is small [3] and a relatively large community can easily write extensions and add-ons, production of scientific software involves the curation of millions of lines of source code. The complexity of this code demands long-term user and developer support to maintain its integrity and performance while keeping up with new computer architectures, fixing bugs, and adding features. Recognizing the importance of these ideas, various funding agencies in the U.S. have made "sustainable software" a key priority in the distribution of research support [1]. Sustainability is a critical goal, but one that can be realized in various ways.

Good Software Is Important to Science.
Computational chemistry software is an essential scientific instrument that facilitates discovery and innovation far beyond the laboratories in which it is created, an achievement that was recognized by the 1998 and 2013 Nobel Prizes in Chemistry [4]. Focusing on quantum chemistry software in this Viewpoint, we note that today any chemist can (with very little training) use numerous quantum chemistry programs as teaching and research tools that aid in the design and interpretation of experiments. A software package should be more than just a tool for end users, however; it should also be a platform to develop and test new models and algorithms. Maintaining a code base requires extensive validation, and given the complexity of modern computational methods, even testing of "pilot code" or a "proof-of-principle" implementation requires access to basic software infrastructure, for example, an integrals library, a self-consistent field procedure, efficient I/O and memory management, tools for manipulating tensors, and so forth. Modularity is a laudable goal, but in reality, "interoperability" often comes at the expense of performance. In high-performance codes, the aforementioned components are tightly interwoven, to the extent that expert help is often required to modify key components or to develop nonstandard interfaces to them. As such, the ability to innovate along either applied or theoretical lines depends crucially on the quality of the software and the availability of documentation and expert support.

As examples, consider two widely used electronic structure programs, Q-CHEM [5] and MOLPRO [6]. These codes consist of ∼5.5 and ∼2.5 million lines of source code, respectively, written in multiple languages and each in continuous development over several decades.
Q-CHEM incorporates scientific advances reported in more than 300 peer-reviewed scientific publications, whereas methods implemented for the first time in MOLPRO have led to 20 high-impact papers that have each been cited over 300 times. Neither code is static: more than 70 scientists are actively contributing to MOLPRO, and the Q-CHEM developer base numbers more than 100. Such agile innovation comes at a price, however. Significant effort is required to keep the code robust, efficient, and sound and to provide the documentation that ensures the usability of new methods and the extensibility of older ones. Software from academia is often developed with an emphasis on ideas rather than implementation, fed by the need for timely peer-reviewed journal publications that provide ongoing grant support and future jobs for graduate students. To bring new ideas to the production level, with software that is accessible to (and useful for) the broader scientific community, contributions from expert programmers are required. These technical tasks usually cannot, and generally should not, be conducted by graduate students or postdocs, who should instead be focused on science and innovation. To this end, Q-CHEM employs four scientific programmers. Other quantum chemistry codes (e.g., MOLPRO [6], TURBOMOLE [7], JAGUAR [8], MOLCAS [9], PQS,
