Ansel Adams is famous for capturing iconic black-and-white photographs of Yosemite National Park and the American West during the early and mid 1900s. In those early days of photography, landscape photographers needed to be hardy. To capture his seminal Monolith image, Ansel climbed 2,500 feet of elevation in the snow carrying a large format camera and wooden tripod. In addition to such physical stamina, photographers needed a deep, intuitive mental understanding of how their camera inputs – shutter speed, focal length, film type, and lens filters – produced the final output – exposure, sharpness, depth of field, and contrast. Ansel had only two of his large glass plate negatives remaining when he reached the spot to photograph the Monolith. Without the luxury of infinite shots or digital previews, he had to previsualize all the complex interactions between the outside light and his camera equipment before opening the shutter.
Today, most photographs at Yosemite are likely taken by cell phones carried in a pocket along well-marked trails. All the decisions of the photographic process – shutter speed, white balance, ISO speed – are made by the camera. After the capture, one can select among dozens of predesigned post-processing treatments to get the right “look” – perhaps a high-contrast black and white reminiscent of Adams. One can now be a photographer while knowing almost nothing about the photographic process. Even today’s professionals offload many of the details of determining exposure and white balance to their cameras, taking advantage of the convenience and opportunities that digital photography offers.
Computational materials science is now following a similar trajectory to that of photography. In the not-too-distant past, one needed to know both density functional theory (DFT) and the details of its implementation to have any hope of computing an accurate result. Computing time was limited, so each calculation setting was thought through and adjusted manually before “opening the shutter” and performing the calculation.
These days, DFT software packages offer useful default settings that work over a wide range of materials. In addition, software “add-ons” are starting to promise essentially “point-and-compute” technology – that is, give the software a material and a desired property and it will figure out all the settings and execute the calculation workflow. Performing a DFT computation is still not as easy as snapping an iPhone picture, but it is becoming more and more accessible to experimental groups and those sitting outside theory department walls.
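As a toy illustration of what such an “add-on” automates, here is a minimal sketch in plain Python. Every name, setting, and heuristic below is invented for this example; a real workflow tool (such as those wrapping VASP or other DFT engines) would generate input files and launch the actual calculation rather than just returning its chosen settings.

```python
# Hypothetical "point-and-compute" layer: the user supplies a material and a
# desired property, and the software picks all the calculation settings.
# This is NOT a real DFT code -- the heuristics are illustrative only.

def choose_settings(formula: str, target_property: str) -> dict:
    """Pick calculation settings automatically from simple heuristics."""
    settings = {"kpoint_density": 1000, "encut_eV": 520, "spin_polarized": False}
    # Turn on spin polarization if a common magnetic element appears.
    if any(el in formula for el in ("Fe", "Co", "Ni", "Mn")):
        settings["spin_polarized"] = True
    # Band structure properties get denser k-point sampling.
    if target_property == "band_gap":
        settings["kpoint_density"] = 2000
    return settings

def point_and_compute(formula: str, target_property: str) -> dict:
    """One call: material in, calculation specification out."""
    settings = choose_settings(formula, target_property)
    # A real workflow would now write input files, run the DFT engine,
    # and parse the outputs; here we only return the chosen settings.
    return {"formula": formula, "property": target_property,
            "settings": settings}

result = point_and_compute("Fe2O3", "band_gap")
print(result["settings"])
```

The point of such a layer is exactly the one made above: the user never touches the “manual controls” at all.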
But, along with these new steps forward in automation, might we be losing something as well?
Often one of the first victims of increased convenience is the quality of the final product. The large format negative used by Ansel Adams to capture “The Monolith” was likely capable of recording over 500 megapixels of detail, much more than today’s highest-end digital cameras (and a far cry from Instagram snapshots).
In the case of DFT, both manual and semi-automatic computations employ the same backend software packages, so it is not as if the automatic “cameras” are of poorer quality. Careful manual control of the calculation settings still yields the most precise results. However, algorithms for fully automatic selection of calculation parameters are growing in sophistication and may someday be embraced almost universally, perhaps in the same manner as the automatic white balance feature in cameras.
More controversial, perhaps, is the rise of automated post-processing routines to “correct” the raw computed data in areas of known DFT inaccuracy. Such techniques are how cell phone cameras provide good images despite having poor sensors and lenses: post-processing algorithms reduce noise and boost sharpness and color to make the final image look better than the raw data. The danger is potentially overcorrecting and making things worse. Fundamentally minded researchers (like fans of high-quality lenses) would insist that quality originate in the raw data itself. The problem is that employing a quality “computational lens” requires much more computational time and expense, and designing better “lenses” that produce better raw data is a very slow research process. It appears that the use of post-processing to correct for DFT’s shortcomings will only grow while researchers wait for more fundamental improvements to accuracy.
If you point your camera up at the night sky, open up the aperture, and take a very long exposure (or stack multiple shots), you can reveal the circular trails left by stars as the earth rotates. These images are not accurate depictions of the sky at any one moment; instead they expose a different truth, how the sky moves with time. To get these images, one must think of the camera not just as a point-and-click device but as a multi-purpose tool.
Creative controls also exist for the quantum calculation camera; one can artificially stretch the spacing between atoms past their normal configuration or calculate the properties of materials with magnetic configurations unknown in reality. These calculations are opportunities to predict not only what is, but what might be and see things in a different way. For example, could a battery material be charged faster if we increased the spacing between atoms? Sadly, the “point-and-compute” method does not encourage this type of creative approach; those unfamiliar with the manual controls may think of DFT in a reduced vocabulary.
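The spirit of this creative control can be shown with a toy pair-potential model. This is a Lennard-Jones interaction in plain Python, not DFT, and the parameters are arbitrary; it simply illustrates the kind of “what if” probing the text describes, stretching a bond past its equilibrium spacing and watching the energy respond.

```python
# Toy model (Lennard-Jones pair potential, NOT DFT): probe configurations
# that may not exist in reality by stretching the atomic spacing.
# EPS and SIGMA are arbitrary reduced units chosen for this illustration.

EPS, SIGMA = 1.0, 1.0              # well depth and length scale
R_EQ = 2 ** (1 / 6) * SIGMA        # equilibrium separation of the pair

def lj_energy(r: float) -> float:
    """Lennard-Jones pair energy at separation r."""
    return 4 * EPS * ((SIGMA / r) ** 12 - (SIGMA / r) ** 6)

# "Stretch" the bond from equilibrium in 5% steps; the energy climbs back
# out of the well, quantifying the cost of the artificial configuration.
for strain in (0.0, 0.05, 0.10, 0.15):
    r = R_EQ * (1 + strain)
    print(f"strain {strain:4.0%}  energy {lj_energy(r):+.4f}")
```

A DFT code plays the same role as `lj_energy` here, only with the full quantum mechanical energy, which is what makes questions like the battery-spacing one answerable on a computer before anyone attempts them in a lab.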
Organization, sharing, and the democratization of DFT
Perhaps the most unambiguous improvement of the new generation of DFT calculation will be how data is organized and shared. In the past, the results of a DFT study were buried within a maze of custom file folder hierarchies stored on a research group computing cluster that could be accessed by only a dozen people and usually navigated by only the author of the study.
We are starting to see a shift. Researchers are developing software to manage and share calculations, and large global data repositories share tens of thousands of DFT results with thousands of users via the internet (something unimaginable to many in the field only a decade ago). The audience for a once purely theoretical method is greatly expanding.
The changes to DFT calculations are not occurring as quickly or as drastically as they did for photography, but they are certainly happening. Like it or not, today’s computational materials scientists will soon be sharing the field with many “point-and-compute” enthusiasts. The old guard, then, must learn to maintain the strengths of the old ways while still taking advantage of all the new opportunities.

Footnotes

For more about the Monolith photograph as well as other photographers (like Cartier-Bresson), see this video series by Riaan de Beer.
 Ansel Adams shot “The Monolith” on 6.5×8.5 film which would be roughly equivalent to 550MP according to this article. Of course, whether his lens was sharp enough to capture that level of detail is another matter.
 For example, corrections have been developed for gas to solid phase reactions, solution to solid phase reactions, metal/element to compound reactions, and localized versus delocalized compound reactions.
 It turns out that increasing atom spacing can make some electrodes charge and discharge faster.
4 thoughts on “Point-and-Compute: Quantum Mechanics in “Auto” Mode”
Thanks for the post. It’s the first time I’ve heard about DFT result databases, and after browsing a little bit, I find them extremely convenient and efficient.
I am thinking of the hours I spent trying to reverse engineer “computational details” sections into inputs that reproduce the actual results accurately…
Great! If you’re starting out, a few of these DFT databases include:
* Materials Project (https://www.materialsproject.org) – the one I work on
* AFLOWLib (http://aflowlib.org)
* Computational Materials Repository (https://cmr.fysik.dtu.dk/cmr/index.php)
* OQMD (http://oqmd.org)
* Electronic Structure Project (http://gurka.fysik.uu.se/ESP/)
Happy materials hunting!
Thanks for the interesting blog. I think your blog is the first of its kind in the field of high-throughput computing of materials.
Just a follow-up question on the above discussion. What are the main differences between the DFT databases listed above? It seems to me that most of them use VASP as the ab-initio computing engine.
Thanks! In terms of methods, many of them are similar. The Electronic Structure Project uses slightly different methods than the rest (FP-LMTO/LDA), and the Computational Materials Repository uses a new GLLB-SC functional (which they claim gives better band gaps) for much of its data. There are also a few differences in the way these projects employ a +U correction, and whether they ‘post-process’ results (Materials Project does this for many of its apps). However, the purpose of these databases doesn’t seem so much to validate and compare different methods as much as to provide ‘accurate’ data on many compounds.
Writing about the differences is a bit difficult, because many of these projects are in their infancy and will likely evolve substantially in the coming years. However, there are differences with regard to:
* Data coverage – OQMD has many metals/alloys and predicted compounds, Materials Project has many large oxides and ceramics and is currently mostly ICSD compounds, CMR has specialized data sets like complete coverage of perovskites
* Properties – AFLOWlib allows you to search on thermoelectric properties, Materials Project lets you filter on thermodynamic stability (e above hull), most (but not all) let you filter on band gap
* Data export – OQMD lets you download all the data, AFLOWlib is difficult to get a large amount of data from, and Materials Project provides a programmatic interface to the data.
* Additional features – Materials Project has many additional “apps” for using and exploring the data sets, e.g. Phase Diagram creation, Pourbaix app creation, or Li Battery Explorer
* Ease of use
Probably the best strategy is to explore and get results from a few of these databases and to monitor how they progress over time, since no database currently supersedes all others and many of these will overcome some of their major limitations in the near future.