Phase Diagram comic: part 5

Important note: if you are just joining us, you probably want to go back and start from the first page of the Phase Diagram comic!

With page 5, we’re nearing the end of the phase diagram adventure. This page explains open element phase diagrams in what I think is an intuitive (and perhaps new) way. Stay tuned for the next post, which will contain the full phase diagram comic (including the final page) and bring the journey to an end!


To be continued…

Further resources

“Thermal stabilities of delithiated olivine MPO4 (M = Fe, Mn) cathodes investigated using first principles calculations” by Ong et al.



If Li ion battery cathode materials (generally oxygen-containing compounds) release O2 gas from their lattice, it can lead to runaway electrolyte reactions that cause fire. Thus, a safe cathode material resists O2 release even under extreme conditions. Stated another way, safety is the “price point” (inverse O2 chemical potential) at which a cathode material will give up its oxygen. The higher the price point, the more stable the compound. This paper compares the critical chemical potential for O2 release between MnPO4 and FePO4 cathode materials, finding that similar chemistry and structure doesn’t necessarily imply similar safety.
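The "price point" idea reduces to a one-line calculation. Here is a minimal sketch with invented energies (the reaction, the numbers, and the compound names are all hypothetical placeholders, not values from the paper): for a decomposition that releases O2, the critical oxygen chemical potential is just the energy difference of the solid phases per mole of O2 released.

```python
# Hypothetical total energies in eV per formula unit -- invented for
# illustration, NOT real DFT data.
E_MPO4   = -40.00   # charged (delithiated) cathode
E_M2P2O7 = -76.50   # assumed solid decomposition product

# Decomposition that gives up oxygen:  4 MPO4 -> 2 M2P2O7 + O2
# Grand-canonical reaction energy:  dG = 2*E_M2P2O7 + mu_O2 - 4*E_MPO4
# O2 is released once dG < 0, i.e. once mu_O2 drops below the critical value:
mu_O2_crit = 4 * E_MPO4 - 2 * E_M2P2O7
print(f"critical mu_O2 = {mu_O2_crit:.2f} eV per O2")
```

A more negative critical chemical potential means the compound holds onto its oxygen under harsher (more reducing) conditions, i.e., it demands a higher "price" before giving it up.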

“CO2 capture properties of M–C–O–H (M = Li, Na, K) systems: A combined density functional theory and lattice phonon dynamics study” by Duan et al.



The CO2 capture problem is to find a compound that absorbs CO2 from an open environment at chemical potentials found in industrial processes, and then releases the CO2 back into some other open environment under sequestration conditions. This paper constructs multi-dimensional phase diagrams to predict how different chemical systems will react with CO2 under different conditions.


Phase Diagram comic: part 4

Important note: if you are just joining us, you probably want to go back and start from the first page of the Phase Diagram comic!

After a hiatus, the phase diagram comic is back! Page 4 is about generating and interpreting ternary and higher phase diagrams. This will lead to the next page, which is going to try to make the extremely useful but generally confusing “grand canonical” construction a bit friendlier. For now, be sure to check out “Further Resources” below the comic for more information!

Also, thanks to Parker Sear for a comment that led directly to panel 8 (expt. vs. computational) of the comic!



To be continued…

Further resources

Wikipedia article on Ternary plots


Ternary plots are not only for phase diagrams (the most creative usage I’ve ever seen is in Scott McCloud’s Understanding Comics, where it is used to explain the language of art and comics). Wikipedia does a good job of explaining the basics of how to read and interpret compositions on ternary diagrams.
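For the curious, placing a three-component composition on a ternary plot is a small barycentric-coordinate calculation. A sketch (the corner assignments below are one convention among several):

```python
import math

def ternary_to_xy(a, b, c):
    """Map composition fractions (a, b, c) to 2D plot coordinates.

    Corner convention (arbitrary choice): pure a at (0, 0), pure b at
    (1, 0), and pure c at the top vertex (0.5, sqrt(3)/2).
    """
    total = a + b + c            # normalize, so inputs needn't sum to 1
    b, c = b / total, c / total
    x = b + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return x, y

# Each pure component lands on its own corner:
print(ternary_to_xy(1, 0, 0))   # (0.0, 0.0)
print(ternary_to_xy(0, 1, 0))   # (1.0, 0.0)
print(ternary_to_xy(0, 0, 1))   # (0.5, ~0.866)
```

An equal mixture of all three components maps to the center of the triangle, exactly as one would read it off a printed ternary diagram.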

“Li-Fe-P-O2 phase diagram from first principles calculations” by Ong et al.


Here is a nice example of the computation of a quaternary phase diagram – sliced into ternary sections – from first principles calculations.

“Accuracy of density functional theory in predicting formation energies of ternary oxides from binary oxides and its implication on phase stability” by Hautier et al.


How accurate are computational phase diagrams? The correct answer, like always, is “it’s complicated”. But based on results from this paper and some experience, colleagues of mine and I have found that an error bar of 25 meV/atom is usually a good estimate. We usually double that to 50 meV/atom when searching for materials to synthesize by conventional methods.

“Formation enthalpies by mixing GGA and GGA + U calculations” by Jain et al.


In an ideal world, first principles calculations would live up to their name and require no adjustable parameters. In practice, however, DFT errors do not always cancel when comparing energies of compounds with different types of electronic states. This paper shows how one can mix two DFT approximations along with some experimental data in order to produce a correct phase diagram across a changing landscape of electronic states.

“First-Principles Determination of Multicomponent Hydride Phase Diagrams: Application to the Li-Mg-N-H System” by Akbarzadeh et al.


An alternate (but equivalent) approach to the convex hull algorithm for determining phase diagrams is to use a linear programming approach. This is demonstrated by Akbarzadeh et al. in the search for H2 sorbents.


Phase Diagram comic: part 3

Important note: if you are just joining us, you probably want to go back and start from the first page of the Phase Diagram comic!

After a couple of pages of preliminaries, we are finally ready to discuss some materials science! Whereas page 2 employed convex hulls to maximize profits in a hypothetical bike shop, page 3 applies the same formalism to determine thermodynamic stability and demonstrate how chemical compounds can decompose to minimize energy.
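The construction from the comic can be sketched in a few lines of code. Below is a bare-bones lower convex hull for a hypothetical binary A–B system (all compositions and formation energies are invented for illustration): compounds on the hull are stable, and anything above it decomposes into a mixture of hull phases.

```python
# Hypothetical (fraction of B, formation energy in eV/atom) pairs for an
# imaginary A-B system -- invented numbers, purely for illustration.
phases = {
    "A":   (0.00,  0.00),
    "A3B": (0.25, -0.10),
    "AB":  (0.50, -0.30),
    "AB2": (2/3,  -0.25),
    "B":   (1.00,  0.00),
}

def lower_hull(points):
    """Lower convex hull of (x, E) points via the monotone-chain algorithm."""
    hull = []
    for p in sorted(points):
        # pop the last vertex while it sits on or above the segment
        # from hull[-2] to the new point p
        while len(hull) >= 2 and (
            (hull[-1][0] - hull[-2][0]) * (p[1] - hull[-2][1])
            - (hull[-1][1] - hull[-2][1]) * (p[0] - hull[-2][0])
        ) <= 0:
            hull.pop()
        hull.append(p)
    return hull

stable = lower_hull(phases.values())
for name, pt in phases.items():
    print(f"{name:>4}: {'stable' if pt in stable else 'decomposes'}")
```

In this made-up system, A3B lies above the line connecting A and AB, so it is predicted to decompose into that two-phase mixture; the other four compounds sit on the hull and are stable.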

And the adventure is far from over! In fact, next time we’ll add a whole new dimension to our analysis! For now, be sure to check out “Further Resources” below the comic for more information.


To be continued…

Further resources:

“Accuracy of ab initio methods in predicting the crystal structures of metals: A review of 80 binary alloys” by Curtarolo et al.


This (somewhat epic!) paper contains data for 80 binary convex hulls computed with density functional theory. The results are compared with known experimental data, and the degree of agreement between the computational and experimental methods is found to be 90–97%.

“A Computational Investigation of Li9M3(P2O7)3(PO4)2 (M = V, Mo) as Cathodes for Li Ion Batteries” by Jain et al.


The endpoints of a binary convex hull need not be elements. For example, in the Li ion battery field one often searches for stable intermediate phases that form at certain compositions as lithium is inserted into a framework structure. The paper above is just one example of many computational Li ion battery papers that use such “pseudo-binary” convex hulls.
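As a concrete (and entirely hypothetical) illustration of why such pseudo-binary hulls matter: once the stable lithiation endpoints are known, the average voltage of each insertion step follows directly from the endpoint energies and the energy of Li metal. All numbers below are invented.

```python
# Invented total energies in eV per formula unit -- illustrative only.
E_Li      = -1.90    # Li metal, per atom (hypothetical)
E_host    = -30.00   # empty host framework (hypothetical)
E_li_host = -35.40   # fully lithiated host (hypothetical)

# Average voltage for Host + Li -> LiHost (one Li transferred, e = 1):
#   V = -(E_lithiated - E_host - n_Li * E_Li) / n_Li
V = -(E_li_host - E_host - 1 * E_Li) / 1
print(f"average voltage ~ {V:.2f} V")
```

With several stable intermediate phases on the hull, the same formula applied between consecutive hull points yields the characteristic voltage "steps" of the charge/discharge profile.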

“Configurational Electronic Entropy and the Phase Diagram of Mixed-Valence Oxides: The Case of LixFePO4” by Zhou et al.


Incorporating temperature into first-principles convex hulls is often possible, but not always straightforward or easy to do. Here is one example of this process that focuses on electronic entropy.

Materials Project Phase Diagram App

The Materials Project phase diagram app allows one to construct interactive binary convex hulls for any system by employing computed data on tens of thousands of compounds. You can also create phase stability diagrams for ternaries and higher!



Phase Diagram comic: part 2

Important note: if you are just joining us, you probably want to go back and start from the first page of the Phase Diagram comic!

Page 2 of the phase diagram comic doesn’t accomplish much on its own, but sets the stage for the upcoming pages. Those of you familiar with computational phase diagrams might already see where this analogy is headed, although there’s still a lot of ground to cover and maybe even a few surprises (unless you are really far ahead), so stay tuned.

And here’s page 2!


To be continued next time, when our protagonist wanders into an experimental synthesis lab…



Phase Diagram comic: part 1

One of the more powerful tools in materials screening is the computational phase stability diagram. Unfortunately, such diagrams are utilized by only a few research groups, and I thought that a comic book about them might improve the situation.

The project is taking a bit longer than expected…so I am releasing the comic page-by-page, with the first page below.  This page is just an introduction; there will be about 5 more pages that fill in more details, with one post per page. The final post will provide the full comic as a PDF and will include an article with more details.

So without further ado – page 1!


To be continued…

Parallelizing people

Developing parallel computer algorithms is becoming more important as CPU architectures jettison clock speed in favor of multicore designs.[1] Indeed, many scientific applications are moving beyond CPUs to graphical processing units (GPUs), which are composed of thousands of individual cores that can complete certain tasks orders of magnitude faster than CPUs. The concept of GPU computing was visually demonstrated by the Mythbusters, who unveiled a massively parallel paintball bazooka (with 1100 individual barrels standing in for a GPU’s many computing cores) that physically produced a painting of the Mona Lisa in 80 milliseconds.[2]

What about human parallelization? In software development, it’s certainly possible to achieve spectacular gains in productivity by progressing from a single coffee-addled, “overclocked” hacker to a large and distributed team. For example, few would have predicted that a complex, tightly integrated, and extremely stable operating system could be produced by a team of volunteers working in their spare time.[3] Yet, Linux is just that and somehow manages to be competitive with billion-dollar efforts in the commercial sector![4]

How can open source and parallel software development work so well? Karl Fogel offers one clue in his book, Producing Open Source Software:[5] namely, that debugging works remarkably well in parallel. Not only do more contributors add more “eyes” on the code, but large and diverse teams are more likely to contain someone with the precise background needed to identify a subtle bug. Such an effect can be seen outside of the software world as well; for example, the internet community recently decoded a cryptic, decades-old letter from a mother to her grown children in very little time.[6] The key to success? Some of the amateur sleuths were intimately familiar with the common Christian prayers that were the focus of her letter. I’ve had similar luck when posting a tricky probability question to the Math Stack Exchange website; the problem was non-trivial to two esteemed mathematician colleagues but was expertly answered several times over (including formulas!) in under an hour by internet citizens more familiar with that type of problem. The first correct response arrived within 15 minutes.

Unfortunately, examples of such community-driven development are rare in the field of materials science (especially in the United States). For example, none of the current electronic structure databases fit the bill. Perhaps this is because such efforts are still young and building momentum. However, it is also possible that scientists underestimate the difficulty of building software with a healthy community of developers. Adding to the difficulty are challenges particular to the scientific realm.

The following are some early experiences from building software for the Materials Project.

Work with computer scientists (but don’t expect them to solve all your computer science problems)

Software development is a fast-moving field, and computer scientists can provide crucial guidance on modern software technologies and development practices. Yet, at the same time, the bulk of actual programming tasks will most likely involve gluing together software design principles with materials science applications. The most effective “bonders” are those who hybridize themselves between materials science and computer science. Trying to divide programming work into materials science problems and computer science problems leads to weak bonds that are more susceptible to communication overhead and more likely to impede progress. Therefore, the Materials Project generally hires research postdocs that are also competent programmers. One strategy that is surprisingly effective in this effort is to employ a basic programming assignment to assess core competency and motivation before the phone interview stage.[7]

Structure code for compartmentalized development (then work a lot to help people anyway)

Getting the community to adopt your project is quite difficult (my own software project, FireWorks, has certainly not gotten there yet!). One thing that is certain is that functional, useful code combined with an open license does not equal a community-driven project. Contrary to potential fears of code theft or harsh criticism upon going open source (or dreams of immediate fame and thank-you letters), the most likely outcome of sharing code is that the world will take little notice.

The usual tips involve writing code that at least has the potential for distributed development, e.g., by writing modular code and spending time on documentation.[5] But following the advice is more difficult when working with scientific collaborators. For example, modular software that computer scientists can read easily is often impenetrable to materials scientists, simply because the latter are often not familiar with programming abstractions such as object-oriented programming that are meant to facilitate scalability and productivity. And because a codebase itself is generally not seen as the final output (scientists are hired and promoted based on scientific results and journal articles, not Github contributions), scientists can be poorly motivated to work on documentation, code cleanup, or unit tests that could serve as a force multiplier for collaboration.

A public codebase is like a community mural that must recruit diverse volunteers and be able to extend and fix itself.

Unfortunately, perhaps the only way to write good code is to first write lots of bad code. This can lead to tension because senior developers can act like already-industrialized nations that expect newcomers to never “pollute” the codebase whilst themselves being guilty of producing poor code during their development.

One strategy to address mixed levels of programming proficiency is to entrust the more technical programmers with the overall code design and core library elements and to train newcomers by having them implement specific and limited components.[8] Surprisingly, the situation can end up not too different from that described in Wikipedia’s article on early Netherlandish painting:[9]

“…the master was responsible for the overall design of the painting, and typically painted the focal portions, such as the faces, hands and the embroidered parts of the figure’s clothing. The more prosaic elements would be left to assistants; in many works it is possible to discern abrupt shifts in style, with the relatively weak Deesis passage in van Eyck’s Crucifixion and Last Judgement diptych being a better-known example.”

If an in-house software collaboration is like a small artist studio, a public software project might be more like a large, ever-expanding community mural that must recruit and retain random volunteers of various skill levels. Somehow, the project must reach a point where the mural can largely extend itself in unexpected and powerful ways while still maintaining a consistent and uniform artistic vision. In particular, the project is truly healthy only when (a) it can quickly integrate new contributions and fixes to older sections and (b) one is confident that the painting would go on even if the lead artist departs.[5] These added considerations involve many human factors and can often be more difficult to achieve than producing good code.

We are often our own worst enemy

As scientists, we are often our own worst enemy in scaling software projects. Whereas computer scientists generally take pride in their “laziness” and happily reuse or adopt existing code (probably learning something in the process), regular scientists are by nature xenophobic and desire to write their own version of codes from scratch using programming techniques they are comfortable with. This often leads to myopic software and stagnation in programming paradigms. In particular, the programming model of “single-use script employs custom parser to read haphazard input file to produce custom output file” is severely outdated but extremely common in scientific codes.

Scientists can also be very protective of code, weighting too heavily the potential negative aspects of being open (e.g., unhelpful criticism, non-attribution) and weighting inadequately its benefits (bug fixes, enhancements, impact). In particular, some supposedly “open” scientific software requires explicitly requesting the code by personal email, or comes saddled with non-compete agreements. It is unclear what fear motivates the authors to put up barricades to programs that they ostensibly want to share. But it is doubtful that Linux – despite its brilliant kernel – would ever have seen such success if all users and collaborators were first required to write Linus Torvalds an email and agree never to work on a competing operating system.

As the years pass, what might distinguish one electronic structure database effort from another might not be the number of compounds or its initial software infrastructure but rather how successfully it leverages the community to scale beyond itself. It will most likely be a difficult exercise in human parallelization, but it can’t be more complicated than writing an operating system with a bunch of strangers – right?

[1] Some of the issues in parallel programming are summarized here.
[2] Here’s that Mythbusters video.
[3] The Cathedral and the Bazaar by Eric S. Raymond
[4] One (crude) estimate puts the cost of Windows Vista development at 10 billion dollars. The budget for Windows 8 advertising alone is estimated to be over a billion dollars.
[5] Producing Open Source Software by Karl Fogel.
[6] The internet quickly decodes a decades-long family mystery.
[7] The usefulness of the programming challenge and other tips for hiring programmers are explained by Coding Horror.
[8] Another strategy is to employ multiple codebases that are cleanly separated in functionality but integrate and stack in a modular way. For example, the Materials Project completely separates the development of its workflow code from its materials science code. Such “separation of powers” can also accommodate different personalities by giving different members full ownership of one code, affording them the authority to resolve small and counterproductive arguments quickly (à la the benevolent dictator model of software management).
[9] Wikipedia article on Early Netherlandish painting. Incidentally, while Wikipedia is often criticized for being unreliable due to its crowdsourced nature, a Nature study found that the online material of Britannica was itself guilty of 2.92 mistakes per science article; Wikipedia was not so much worse with 3.86 mistakes per article. Another interpretation is that both these numbers are way too large!

Will computers imagine the materials of the future?

As a species, we are particularly proud of our capacity for creative thought. Our ability to invent tools and imagine abstract concepts distinguishes us from other animals and from modern day computers (for example, the Turing Test is essentially a clever form of a creativity test).

However, in many cases, humans can now program computers to solve problems better than humans themselves can. In 1987, chess grandmaster Garry Kasparov proclaimed that “No computer can ever beat me”. The statement was perfectly reasonable at the time, but it would be proven wrong within a decade. Similar turning points in history won’t be limited to games.[1] For a 2006 NASA mission that required an unconventional antenna design, the best-performing, somewhat alien-looking solution was not imagined by a human antenna design expert (of which NASA surely had plenty) but was instead evolved within a computer program. When mission parameters changed, a new design was re-evolved for the new specifications much more rapidly than could be achieved by NASA staff.[2]

It is difficult to say whether current achievements by computers truly constitute “creativity” or “imagination”. However, the philosophical ambiguity has not stopped materials researchers from joining in on the action. As one example, Professor Artem Oganov’s lab at Stony Brook University has used computer programs to evolve new materials that have been observed at high pressures and might play a role in planet formation. These materials can be quite unexpected, such as a new high-pressure form of sodium that is transparent and insulating rather than silver and metallic. Thus, while we may not know whether to label the computer’s process as a “creative” one, the end result can certainly possess the hallmarks of purposeful design.

Indeed, if there is any doubt that computer algorithms are capable of producing creative solutions, one needs only to visit the Boxcar2D site.[3] This website uses evolutionary algorithms to design a two-dimensional car from scratch; the process unfolds before your eyes, in real time.

Cars designed by the Boxcar2D algorithm at various generations of the evolutionary algorithm. Seemingly arbitrary combinations of wheels and chassis at generation zero gradually and automatically become ordered and logical arrangements.

It is instructive to observe the Boxcar2D algorithm “thinking”. Towards the beginning of the simulation, most designs are underwhelming, but a few work better than you’d intuit. For example, my simulation included a one-wheeler that employed an awkward chassis protrusion as a brake that modulated speed during downhill sections. It was a subtle strategy that outperformed a more classic motorbike design that wiped out on a mild slope.

Eventually, the one-wheelers would prove too cautious, and the algorithm began designing faster two-wheelers that better matched the size of the wheels to the proportions of the frame to prevent flipping. Finally, the algorithm designed a car that was symmetric under being flipped upside down, eliminating the problem altogether. All this happened in the course of minutes in a somewhat hypnotic visual progression from random to ordered.
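The loop Boxcar2D runs is the standard evolutionary recipe: evaluate a population, keep the fittest, and breed mutated offspring. A toy sketch of that loop (maximizing a made-up fitness function rather than driving distance; all parameters are arbitrary choices):

```python
import random

random.seed(0)  # make the toy run reproducible

def fitness(genome):
    # Stand-in for "how far the car drives": peaks when every gene is 0.5.
    return -sum((g - 0.5) ** 2 for g in genome)

def evolve(pop_size=20, genes=8, generations=40, keep=5, mut=0.1):
    # start from random "designs"
    pop = [[random.random() for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:keep]                       # selection (elitism)
        children = []
        while len(children) < pop_size - keep:
            a, b = random.sample(parents, 2)
            cut = random.randrange(genes)
            child = a[:cut] + b[cut:]              # crossover
            i = random.randrange(genes)
            child[i] += random.uniform(-mut, mut)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(f"best fitness: {fitness(best):.4f}")
```

Because the elite designs are carried over unchanged, the best fitness never regresses between generations, which is why watching such an algorithm feels like steady, one-directional "learning".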

However, despite the successes in computational optimization, several obstacles still prevent future materials from being designed by computers. The biggest problem is that computers cannot yet predict all the important properties of a material accurately or automatically. In many cases, computational theory can only predict a few heuristic indicators covering less than half the technologically important properties of a material. Thus, materials that look promising to the computer are based on incomplete models and require further evaluation and perhaps re-ranking by human experts. Indeed, the most successful automatic algorithms have been those used for crystal structure prediction (such as the research of Professor Oganov), for which simple computer calculations are very good at ranking different compounds without adult supervision.

Will we train computers to imagine materials for us? [Crystal structure based on a CuI structure predicted by the USPEX code].

There are other problems; for example, genetic algorithms typically require many more calculations to find a good solution compared with human-generated guesses. However, this “problem” may also be an advantage. The computer’s willingness to produce several rounds of very poor-performing, uninhibited designs frees it from the bias and conservatism that can be displayed by human researchers, thereby revealing better solutions in the long run.[4] Still, helping the computer become a smarter and more informed guesser would certainly improve the prospects for designing materials in computers.

Today, almost all materials are still devised by humans within a feedback loop of hypothesis and experiment. The next step might be to mix human and machine – that is, to use human intuition to suggest compounds that are further refined by a computer. Yet, perhaps one day, materials design may become more like chess or antenna design. Like parents of a gifted child, it might become more logical for materials scientists to train their computers to be more imaginative than them.

[1] A nice illustrated history of artificial intelligence in games was presented by XKCD.
[2] More about NASA’s evolved antenna design here.
[3] Attempt to evolve cars from randomness at the Boxcar2D site.
[4] For example, my colleagues (with minor help from myself) devised data-driven algorithms for materials prediction that more closely mimic a first step undertaken by many researchers (links here and here). These algorithms are more efficient at finding new materials but are much less “creative” than the evolutionary algorithms employed by Professor Oganov and others.
