# How to use *sqsgen*?

This is a short tutorial on how *sqsgenerator* works.

## Using the CLI interface

This section deals with the usage of the *sqsgenerator* package. More granular documentation for the CLI can be found in the {ref}`CLI Reference `. The CLI interface was built using the excellent [click](https://click.palletsprojects.com/en/8.0.x/) framework.

Once you have managed to {ref}`install ` *sqsgenerator*, you should have a command `sqsgen` available in your shell.

Make sure you can call the `sqsgen` command before you start, using

```{code-block} bash
sqsgen --version
```

which should print version information about *sqsgenerator* and its dependencies.

### The `sqs.yaml` file

*sqsgenerator* uses a `dict`-like configuration to store the parameters used for the iteration/analysis process. By default, the program assumes the configuration is stored in a YAML file, a format that is easy for humans to read and write.

In this tutorial we will focus on setting up proper `sqs.yaml` input files. Most of the CLI commands require an input settings file. However, specifying a file is always optional, since *sqsgenerator* will look for a default file named `sqs.yaml` in the **current directory**.

*sqsgenerator* can also read other formats that can store Python `dict`s, such as **JSON** and **pickle**. Therefore, all commands which require a settings file also have an `--input-format` (`-if`) option, which instructs the program to use a different file format. For more information please have a look at the {ref}`CLI Reference `.

````{admonition} *sqsgenerator* searches for a *sqs.yaml* file
:class: note, dropdown

Most of `sqsgen`'s subcommands need such an input file to obtain the parameters. By default the program searches for a file named `sqs.yaml` in the directory where it is executed. You can always specify a different file in case you want the program to use a different one.

```{code-block} bash
sqsgen run iteration  # is equal to
sqsgen run iteration sqs.yaml
sqsgen run iteration path/to/my/custom/sqs_config.yaml  # or pass a custom file
```

This applies to nearly all commands, except `sqsgen analyse`, which by default expects {ref}`sqs.result.yaml `
````

### Simple SQS

#### Simple SQS - an ideal $\text{Re}_{0.5}\text{W}_{0.5}$ solution

In the following example we use a **Monte-Carlo** approach, probing **one billion** different configurations. Only **the first** coordination shell should be taken into account. We create a supercell with 54 atoms by replicating a simple B2 structure.

```{code-block} yaml
---
lineno-start: 1
caption: |
    Download the input {download}`YAML file `
---
structure:
  lattice:
    - [3.165, 0.0, 0.0]
    - [0.0, 3.165, 0.0]
    - [0.0, 0.0, 3.165]
  coords:
    - [0.0, 0.0, 0.0]
    - [0.5, 0.5, 0.5]
  species: [W, Re]
  supercell: [3, 3, 3]
iterations: 1e9
shell_weights:
  1: 1.0
```

So let's go through this configuration together:

- **Lines 2-5:** create a cubic lattice with a lattice parameter of $a_{bcc} = 3.165\;\text{Å}$
- **Lines 7-8:** place two lattice sites at positions $(0,0,0)$ and $(\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2})$
- **Line 9:** occupy the first site with tungsten and the second one with rhenium
- **Line 10:** replicate this unit cell three times along the $a$, $b$ and $c$ directions
- **Line 11:** test $10^9$ different configurations
- **Lines 12-13:** use only the first coordination shell with a shell weight of $w_i=1.0$ {eq}`eqn:objective`. We have to state that explicitly. By default, SRO parameters in **all available** coordination shells {eq}`eqn:wc-sro-multi` are minimized at the same time

In the unit cell we have a 50-50 composition. Replication does not change the chemistry, thus we end up with 27 tungsten atoms and 27 rhenium atoms in the final configuration.

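The atom counts quoted above can be reproduced with a few lines of plain Python. This is only a sketch of the bookkeeping, not a description of how *sqsgenerator* builds the supercell internally; the variable names are chosen here for illustration and the values are copied from the YAML example.

```python
from collections import Counter
from math import prod

# Values copied from the YAML example above
species = ["W", "Re"]    # one entry per lattice site of the B2 unit cell
supercell = (3, 3, 3)    # replication along a, b and c

# Every unit-cell site is repeated once per replica of the cell
replicas = prod(supercell)
composition = {element: count * replicas
               for element, count in Counter(species).items()}

print(composition)                 # {'W': 27, 'Re': 27}
print(sum(composition.values()))   # 54
```

The total of 54 is the number of lattice positions that any explicitly given composition has to fill.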
````{admonition} Check which species are actually distributed
:class: tip, dropdown

You can always check which atomic species are distributed on the lattice sites using the `params show` command

```{code-block} bash
sqsgen params show --parameter composition
```
````

#### Running an optimization

Once you have created a YAML input file you can run an optimization. In case you have downloaded the above example you can run it using

```{code-block} bash
sqsgen run iteration re-w.first.yaml
```

In case you have not passed a custom input file the program will create an output file named `sqs.result.yaml`. Otherwise, it will modify the passed filename, e.g. `re-w.first.yaml` $\rightarrow$ `re-w.first.result.yaml`

#### The `sqs.result.yaml` file
(sqs-result-yaml)=

The `*.result.yaml` files are used to dump the output of the optimization process. The file contains the following entries:

- `structure`: the structure read from the input file in expanded format
- `which`: the list of selected lattice positions
- `timings`: runtime information, saved in a `dict`-like format. The keys are integers and identify the MPI rank. If you **do not have** an MPI-enabled version, it always contains only one entry with key `0`. The numbers represent the average time a thread needs to analyse a structure and generate the next one. The times are in **µs**, while the index in the value list corresponds to the thread ID.
- `configurations`: the computed SQS results in a `dict`-like manner. The keys are the [rank](https://stackoverflow.com/questions/22642151/finding-the-ranking-of-a-word-permutations-with-duplicate-letters) of the permutation sequence. The values are a sequence of atomic symbols.

##### How many structures are actually computed?

The number of structures in `sqs.result.yaml` is basically determined by the {ref}`max_output_configurations ` parameter, which is 10 by default. There is, however, a post-processing step **after the minimization** process.
The default behaviour of *sqsgenerator* is to discard those configurations which do not exhibit the minimal value of the objective function. Furthermore, our definition of the objective function in Eq. {eq}`eqn:objective` may yield "*degenerate*" results, which are also discarded in the post-processing step. This "*degeneracy*" decreases when more coordination shells are included.

- To include degenerate structures you can use the `--similar`/`-s` switch
- To also include structures with a possibly non-optimal objective function use the `--no-minimal`/`-nm` switch

````{admonition} Include objective function $\mathcal{O}(\sigma)$ and SRO parameters $\alpha^i_{\xi\eta}$
:class: tip, dropdown

To also dump the objective function $\mathcal{O}(\sigma)$ {eq}`eqn:objective` and the SRO parameters $\alpha^i_{\xi\eta}$ {eq}`eqn:wc-sro-multi` you can use the `--dump-include`/`-di` switch. You need to specify each quantity explicitly.

```{code-block} bash
sqsgen run iteration --dump-include parameters --dump-include objective
```

will include both quantities in `sqs.result.yaml`. This will, however, modify the structure of the `configurations` key in `sqs.result.yaml`.
````

#### Export the computed structures

To obtain the structures stored in `sqs.result.yaml` the `export` command should be used. This command searches for a **sqs.result.yaml** file if none is specified.

```{code-block} bash
sqsgen export re-w.first.result.yaml
```

will export all the structures in **cif** format.

- The filename will be the rank of the permutation.
- You can specify a different output format using the `--format`/`-f` switch.
- You can explicitly specify the backend with the `--writer`/`-w` switch. If not specified otherwise the **ase** backend will be used
- To gather the structure files in an archive use the `--compress`/`-c` switch

````{admonition} Directly export the structure
:class: tip, dropdown

The `export` command is optional.
Structures can be exported directly using the `run iteration` command if the `--export`/`-e` switch is passed. If the `--export` switch is passed, the above-mentioned switches (`--format`/`-f`, `--writer`/`-w` and `--compress`/`-c`) can be used in combination with the `run iteration` command

```{code-block} bash
sqsgen run iteration -e -f poscar -c xz -w pymatgen
```

will run an iteration and export all computed structures in **POSCAR** format using **pymatgen** in an archive **sqs.results.tar.xz**
````

#### Specifying your own compositions - $\text{Re}_{0.333}\text{W}_{0.667}$
(example-two)=

Suppose we want to move on to different compositions and distribute different numbers of tungsten and rhenium atoms. In this case we have to explicitly specify a {ref}`composition ` parameter. Using this directive we can specify exactly **which** and **how many** atoms should be distributed. We will slightly modify the example from above.

```{warning} Package dependencies
In order to run this example you need to have either `ase` or `pymatgen` installed. See {ref}`optional dependencies ` for more information
```

```{code-block} yaml
---
lineno-start: 1
caption: |
    Download the {download}`YAML file ` and the {download}`B2 structure file `
---
structure:
  file: b2.vasp
  supercell: [3, 3, 3]
iterations: 1e9
shell_weights:
  1: 1.0
composition:
  Re: 18
  W: 36
```

Again, let's analyse the differences in the input file and what it is actually doing under the hood

- **Line 2:** read the file **b2.vasp** from disk. By default `ase` will be used to read the structure file. For more information see the {ref}`structure ` parameter documentation
- **Lines 7-9:** distribute 18 rhenium and 36 tungsten atoms on the lattice positions.
  1. the B2 structure file contains 2 lattice positions
  2. in **Line 3** we replicate it three times in all directions
  3. one needs to distribute $2 \times 3 \times 3 \times 3 = 54$ atoms on the lattice positions
  4. the number of distributed atoms **must match** the number of lattice positions to occupy

Using the `composition` parameter you can distribute any arbitrary sequence of atomic elements. Suppose we want to create cells with an even more complicated composition, e.g. $\text{Re}_{12}\text{W}_{14}\text{Mo}_{14}\text{Ta}_{14}$. Simply change the `composition` section in the above example to:

```{code-block} yaml
---
lineno-start: 7
caption: |
    B2 structure with $\text{Re}_{12}\text{W}_{14}\text{Mo}_{14}\text{Ta}_{14}$ stoichiometry
---
composition:
  Re: 12
  W: 14
  Mo: 14
  Ta: 14
```

### Perform SQS only on selected sites

#### Perform SQS on a sublattice only - $\text{Ti}\text{N} \rightarrow \text{Ti}_{0.5}(\text{B}_{0.25}\text{N}_{0.25})$

*sqsgenerator* allows you to select lattice positions on which the SQS iteration is then carried out. This is done by specifying a {ref}`which ` input parameter. All sites which are not explicitly chosen are ignored during the optimization. The following example checks **all possible** configurations and will therefore yield an optimal SQS structure

```{code-block} yaml
---
lineno-start: 1
emphasize-lines: 4, 7
caption: |
    Download the {download}`YAML file ` and the {download}`TiN structure file `
---
structure:
  supercell: [2, 2, 2]
  file: ti-n.cif
mode: systematic
shell_weights:
  1: 1.0
which: N
composition:
  B: 16
  N: 16
```

- **Line 4:** set the iteration mode to **systematic**. This will scan through all possible structures. **Note:** Check the size of the configuration space before actually running the minimization process. Otherwise, the program might run "*forever*"
- **Line 7:** use only nitrogen lattice positions to perform the SQS minimization.

This example generates all possible configurations ($\approx 6 \cdot 10^8$) and analyses them. It is recommended to use `compute estimated-time` when using **systematic** iteration mode.

```{code-block} bash
> sqsgen compute total-permutations ti-n.yaml  # check the size of the configurational space
601080390
> sqsgen compute estimated-time ti-n.yaml  # estimate how long it will take
It will take me roughly 14 minutes and 23.576 seconds to compute 601080390 iterations (on 8 threads)
> sqsgen run iteration ti-n.yaml
```

#### $\gamma$-iron (austenite) - Partial random occupancy of interstitial atoms

*sqsgenerator* also knows a fictitious atomic species "**0**", representing a vacancy. During the optimization vacancies are treated as atoms. When exporting the structures the vacancies are deleted.

The following example constructs a $\gamma$-iron cell, where carbon is distributed on the **octahedral interstitial** sites. Therefore, the {download}`structure file ` contains four iron atoms and four hydrogen (H) atoms on the octahedral sites.

```{code-block} yaml
---
lineno-start: 1
emphasize-lines: 7, 10
caption: |
    Download the {download}`YAML file ` and the {download}`iron structure file `
---
structure:
  supercell: [3, 3, 3]
  file: gamma-iron-octahedral.vasp
iterations: 1e8
shell_weights:
  1: 1.0
which: H
composition:
  C: 9
  0: 99
```

- **Line 7:** hydrogen works here as a **dummy** species. We select those interstitial sites
- **Line 10:** distribute nine carbon atoms and 99 vacancies

### Analyse existing structures

Sometimes it is desirable to compute the SRO parameters ($\alpha^i_{\xi\eta}$) for an existing arrangement of atoms rather than to generate a new one. To analyse existing structures *sqsgenerator* provides you with the "*analyse*" command.

#### Restore $\alpha^i_{\xi\eta}$ from structure files

**Note:** This example only works with `pymatgen` or `ase` installed

1. We use the {ref}`example above ` to generate some randomized structures by executing

   ```{code-block} bash
   sqsgen run iteration --similar --no-minimal --export --dump-include objective --dump-include parameters re-w.second.yaml
   # or with shortcuts
   sqsgen run iteration -s -nm -e -di objective -di parameters re-w.second.yaml
   ```

2. The above command will store the optimized configurations in a file named *re-w.second.result.yaml*. In addition, the file may also contain configurations which do not minimize (`--no-minimal/-nm`) the objective function Eq. {eq}`eqn:objective-actual`. Furthermore, it will not check for duplicates in the SRO formalism (`--similar/-s`). Finally, *re-w.second.result.yaml* will contain the SRO parameters $\alpha^i_{\xi\eta}$ (`--dump-include/-di parameters`) as well as the value of the objective function $\mathcal{O}$ (`--dump-include/-di objective`). All the configurations will also be exported in *CIF* format (default). Listing your directory should give you ten additional cif-files.
3. Please inspect *re-w.second.result.yaml* with a text editor
4. Now, the task is to reconstruct the SRO parameters from the exported cif-files. Therefore use:

   ```{code-block} bash
   sqsgen analyse *.cif
   ```

   The command will print the computed SRO parameters, nicely formatted, to the console.
   **Note:** The output will show you SRO parameters for *seven* coordination shells with the default {ref}`shell_weights ` of $w^i = \frac{1}{i}$. This happens since *sqsgenerator* does not know the settings used for computing the structures, hence it uses its default values.
5. To fix this, the `analyse` command takes a `--settings/-s` parameter. It points to a file providing the input settings.
   In this particular example we have two ways forward to obtain the same values as in *re-w.second.result.yaml*:

   - create a new file *settings.yaml* with the following lines

     ```{code-block} yaml
     shell_weights:
       1: 1.0
     ```

     to take into account only the first coordination shell as {ref}`above ` and run

     ```{code-block} bash
     sqsgen analyse *.cif --settings settings.yaml
     ```

   - reuse the input file *re-w.second.yaml* and just execute

     ```{code-block} bash
     sqsgen analyse *.cif --settings re-w.second.yaml
     ```

     *sqsgenerator* will ignore all parameters which are not needed.

#### Counting pairs in coordination shells using the `analyse` command

*sqsgenerator* can also compute the number of bonds in existing structures, by properly tweaking the parameters for the `analyse` command. A closer look at Eq. {eq}`eqn:sro-modified` reveals that, by setting the {ref}`prefactors ` $f^i_{\xi\eta} = 1$, the SRO parameters become $\alpha^i_{\xi\eta} = 1 - N^i_{\xi\eta}$. Hence we modify the *settings.yaml* file to

```{code-block} yaml
---
lineno-start: 1
emphasize-lines: 4-5
---
shell_weights:
  1: 1.0
  2: 0.5
prefactor_mode: set
prefactors: 1
```

- **Line 4:** explicitly overrides the values of $f^{i}_{\xi\eta}$ with those provided in the file
- **Line 5:** set $f^i_{\xi\eta}$ to 1

To obtain the number of $\xi - \eta$ pairs we have to compute $N^i_{\xi\eta} = 1 - \alpha^i_{\xi\eta}$.

*sqsgenerator* also supports output formats other than printing to the console.
Hence, we want to illustrate how *sqsgenerator*'s CLI can be used directly in combination with Python without using the Python API

```{code-block} python
import os
import yaml
import pprint
import numpy as np

# analyse the structure and export results in YAML format (--output-format/-of yaml)
yaml_output = os.popen('sqsgen analyse *.cif -s settings.yaml -of yaml')
results = yaml.safe_load(yaml_output)

# loop over output results
for analysed_file, configurations in results.items():
    for rank, result in configurations.items():
        # actually compute N = 1.0 - alpha
        result['bonds'] = 1.0 - np.array(result.get('parameters'))
        pprint.pprint(result)
```

## Using Python API

Of course, you can also use *sqsgenerator* directly from your Python interpreter. The package is designed in such a way that all public functions are gathered in the `sqsgenerator.public` module. Those which are needed to generate and analyse structures are forwarded to the *sqsgenerator* module itself and can be imported from there.

Basically the API is built around two functions

- {py:func}`sqsgenerator.public.sqs_optimize` - to perform SQS optimizations
- {py:func}`sqsgenerator.public.sqs_analyse` - to compute objective function and SRO parameters

Both functions take a `dict` as their main input. The YAML inputs above are just a file-based representation of those settings.

### Introduction

To read a settings file and obtain a `dict`-like configuration use the {py:func}`sqsgenerator.public.read_settings_file` function. The examples shown above can easily be executed in the following way using a Python script:

```{code-block} python
# we use the first example shown in the CLI - How to -> re-w.first.yaml
from sqsgenerator import read_settings_file, sqs_optimize

configuration = read_settings_file('re-w.first.yaml')
results, timings = sqs_optimize(configuration)
```

(optimization-output)=

{py:func}`sqsgenerator.public.sqs_optimize` outputs a tuple of **two** values.
The first one contains the actual results, the second one runtime information.

- `results` will contain a dictionary with integer keys. The integer key is the index of the permutation sequence. As this key is in decimal representation it might be a very long one. The value behind each key is again a `dict`, containing the following keys
  - `configuration`: a list of strings
  - `objective`: the value of the objective function
  - `parameters`: the SRO parameters as numpy array

### Again - $\text{Re}_{0.333}\text{W}_{0.667}$ - but from scratch

We now want to show how the {ref}`second example ` would look if it was built with Python functions

```{code-block} python
from sqsgenerator import sqs_optimize

configuration = dict(
    structure=dict(file='b2.vasp', supercell=(3, 3, 3)),
    iterations=1e9,
    shell_weights={1: 1.0},
    composition=dict(Re=18, W=36)
)

results, timings = sqs_optimize(configuration)
```

### Exporting the generated structures

#### Construct the generated structures

By default, {py:func}`sqsgenerator.public.sqs_optimize` does not construct the {py:class}`Structure` objects from the generated configurations.
You have to explicitly tell it to do so using the `make_structures` keyword. Therefore, the last line in the previous example becomes

```{code-block} python
results, timings = sqs_optimize(configuration, make_structures=True)
```

This switch only affects post-processing, and adds a `structure` key to the `results` dictionary, which then becomes

```{code-block} python
{
    24002613167337: {
        'configuration': ['W', 'W', 'W', 'Re', 'W', 'Re', 'Re', 'W', 'W', 'W', 'Re', 'W', 'Re', 'Re', 'W', 'W', 'Re', 'Re', 'W', 'Re', 'Re', 'W', 'W', 'W', 'W', 'W', 'W', 'W', 'Re', 'W', 'W', 'W', 'W', 'Re', 'W', 'W', 'Re', 'W', 'W', 'Re', 'W', 'Re', 'W', 'Re', 'W', 'Re', 'Re', 'W', 'W', 'W', 'W', 'W', 'W', 'W'],
        'objective': 5.551115123125783e-17,
        'parameters': array([[[5.00000000e-01, 5.55111512e-17],
                              [5.55111512e-17, 5.00000000e-01]]]),
        'structure': Structure(W3ReWRe2W3ReWRe2W2Re2WRe2W7ReW4ReW2ReW2ReWReWReWRe2W7, len=54)
    }
}
```

````{admonition} Specify the type of the output structure using *structure_format* keyword
:class: tip, dropdown

In case you have set the `make_structures` keyword to `True` you can additionally specify the output type of the structure objects using the `structure_format` keyword. Of course, you need [ase](https://wiki.fysik.dtu.dk/ase/) and/or [pymatgen](https://pymatgen.org/) installed to use this feature.

You can pass three different values to `structure_format`:

- **default** $\rightarrow$ {py:class}`sqsgenerator.public.Structure`
- **ase** $\rightarrow$ {py:class}`ase.atoms.Atoms`
- **pymatgen** $\rightarrow$ {py:class}`pymatgen.core.Structure`
````

#### Writing generated structures to file

In order to export the generated structures to files and/or archives using the {py:func}`sqsgenerator.public.export_structures` function, you need to set `make_structures=True` to advise the program to construct the structures. Moreover, `structure_format` must be set to `default` (which is the default value anyway).
Exporting the generated structures might look like this

```{code-block} python
from operator import itemgetter
from sqsgenerator import sqs_optimize, export_structures, read_settings_file

configuration = read_settings_file('sqs.yaml')
results, timings = sqs_optimize(configuration, make_structures=True)
export_structures(results, functor=itemgetter('structure'))
```

### Computing the SRO parameters $\alpha_{\xi\eta}^i$ and objective function $\mathcal{O}(\sigma)$ of existing structures

It is also possible to compute the SRO parameters of existing structures. For this purpose the API exports the {py:func}`sqsgenerator.public.sqs_analyse` function, which computes those quantities.

{py:func}`sqsgenerator.public.sqs_analyse` takes a dict-like configuration as well as an iterable of structures, which will be analysed. The output format is exactly the same as for {py:func}`sqsgenerator.public.sqs_optimize` (see {ref}`above `)

```{code-block} python
import numpy.testing
from operator import itemgetter
from sqsgenerator import sqs_optimize, read_settings_file, sqs_analyse

configuration = read_settings_file('sqs.yaml')
results, timings = sqs_optimize(configuration, make_structures=True, minimal=False, similar=True)  # same as --no-minimal --similar
structures = map(itemgetter('structure'), results.values())  # for this we need make_structures=True
analysed = sqs_analyse(structures, settings=configuration, append_structures=True)

for rank in results:
    # we check that we obtain the same results with sqs_analyse
    assert rank in analysed
    assert results[rank]['objective'] == analysed[rank]['objective']
    assert results[rank]['structure'] == analysed[rank]['structure']
    assert results[rank]['configuration'] == analysed[rank]['configuration']
    numpy.testing.assert_array_almost_equal(results[rank]['parameters'], analysed[rank]['parameters'])
```

### Other (maybe) useful examples

#### Conversion between structure types

*sqsgenerator*'s API exports functions to convert internal
{py:class}`sqsgenerator.public.Structure` objects to types employed by larger projects ([ase](https://wiki.fysik.dtu.dk/ase/) and [pymatgen](https://pymatgen.org/))

```{admonition} Packages must be available
:class: warning

In order to convert structure objects back and forth you need to have these packages installed, otherwise *sqsgenerator* will raise a `FeatureError`
```

The compatibility functions are:

- **pymatgen**:
  - {py:func}`sqsgenerator.public.to_pymatgen_structure`
  - {py:func}`sqsgenerator.public.from_pymatgen_structure`
- **ase**:
  - {py:func}`sqsgenerator.public.to_ase_atoms`
  - {py:func}`sqsgenerator.public.from_ase_atoms`

```{code-block} python
import numpy as np
import ase.atoms
import pymatgen.core
from sqsgenerator import to_pymatgen_structure, from_pymatgen_structure, to_ase_atoms, from_ase_atoms, Structure

fcc_al = Structure(4.05*np.eye(3), np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]), ['Al']*4)

fcc_al_ase = to_ase_atoms(fcc_al)
fcc_al_pymatgen = to_pymatgen_structure(fcc_al)

assert isinstance(fcc_al_ase, ase.atoms.Atoms)
assert isinstance(fcc_al_pymatgen, pymatgen.core.Structure)
assert fcc_al == from_ase_atoms(fcc_al_ase)
assert fcc_al == from_pymatgen_structure(fcc_al_pymatgen)
```

## Graceful exits

As the SQS optimization may require a large number of iterations, it is sometimes desirable to stop the process (e.g. because of time limits on HPC clusters). When a signal is sent to *sqsgenerator* it does not crash but rather exits and writes out the current state of the optimization.

*sqsgenerator*'s core routine installs a temporary *SIGINT* signal handler which replaces Python's default `KeyboardInterrupt`. Thus, while executing the optimization, you can always interrupt it by hitting **Ctrl+C**.
You should get a warning that the program was interrupted

```{code-block} bash
[warning]:do_pair_iterations::interrupt_message = "Received SIGINT/SIGTERM results may be incomplete"
/media/DATA/drive/projects/sqsgenerator-core/sqsgenerator/public.py:137: UserWarning: SIGINT received: SQS results may be incomplete
  warnings.warn('SIGINT received: SQS results may be incomplete')
```

In MPI-parallel runs both the *SIGINT* and *SIGTERM* handlers are overwritten. Therefore, if you run *sqsgenerator* interactively using the `mpirun` command you can also gracefully terminate the process using **Ctrl+C**.

## A note on the number of `iterations`

It is actually very hard to tell what a "**sufficiently**" large number for the `iterations` parameter is. As the configuration space grows extremely fast (factorially), it is not possible to sample it properly once the structures get large enough. To get a feeling for how many structures there are, set `mode` to **systematic** and run

```{code-block} bash
sqsgen compute total-permutations
```

This will print the number of different structures one can construct. This number might be really huge, however lots of them might be symmetrically equivalent.

A few rules of thumb, and what you can do if you deal with "*large*" systems

- Maybe you have knowledge about the system, e.g. certain species are restricted to different sublattices.
- Check how long it would take to compute your current settings

  ```{code-block} bash
  sqsgen compute estimated-time
  ```

  You can tune the number of permutations to a computing time you can afford. The above command gives only an estimate for the current machine. It analyzes $10^5$ random configurations and extrapolates to the desired number of iterations. However, this value should be seen as an **upper bound**, as cycle times are slightly reduced for a large number of iterations
- Reduce the number of shells. This has a two-fold advantage:
  1. In contrast to old versions of *sqsgenerator*, the current implementation profits greatly from a decreased number of coordination shells. The actual speedup depends on the input structure but might be up to an order of magnitude when compared to the default value (all shells are considered)

     ```{image} images/time_vs_shells.svg
     :alt: Estimated time vs. number of coordination shells
     :width: 67%
     :align: center
     ```

  2. The image size of the objective function is drastically reduced. In other words, a lot of different structures are mapped onto the same value of the objective function.

### A simple convergence-test

For systems that one uses often, it might be useful to know how many {ref}`iterations ` would be needed to get a converged result. The number of {ref}`iterations ` mainly depends on three factors:

1. **Cell size:** the configuration space grows very fast, see Eq. {eq}`eqn:multinomial`
2. **Composition:** the size of the configuration space for a given cell size $N$, according to Eq. {eq}`eqn:multinomial`, reaches its maximum for close to equi-atomic compositions
3. **Number of coordination shells:** The objective function (Eq. {eq}`eqn:objective`) is a sum of the SRO parameters. Therefore, the more shells are considered, the larger the image domain of the objective function becomes. In other words, the more shells are considered, the more {ref}`iterations ` will be needed.

````{admonition} Warning
:class: warning, dropdown

The following example makes use of exhaustive enumeration multiple times. Therefore, check if you can run an exhaustive enumeration in reasonable time using

```{code-block} bash
sqsgen compute estimated-time exhaustive-setup.sqs.yaml
```

and by benchmarking a systematic iteration. For large cells, such a convergence test might not be possible at all because the configurational space is too large. Moreover, keep in mind that such a convergence test is computationally demanding, as it involves a lot of different single-point runs.
````

```{code-block} python
---
lineno-start: 1
emphasize-lines: 7,11,15-19,23,24,26,32,34
caption: |
    Script to perform a "*convergence test*" for the parameters $N^{\mathrm{shells}}$ and $N^{\mathrm{iterations}}$.
---
from matplotlib import pyplot as plt
from sqsgenerator import sqs_optimize
from operator import itemgetter as item
from math import isclose, factorial as f, log10

# compute size of configurational space
conf_space_size = f(36)/(f(24)*f(12))

NSHELLS=7  # max number of shells
MIN_MAGNITUDE=4  # minimum number of iterations
MAX_MAGNITUDE=int(log10(conf_space_size))  # maximum number of iterations
SAMPLES=int(10**MIN_MAGNITUDE)  # maximum number of structures

# 36 atoms hcp with 12 Re and 24 W atoms
settings = dict(
    structure=dict(file='ti-hex.vasp', supercell=[2, 2, 3]),
    composition=dict(Re=12, W=24),
    max_output_configurations=SAMPLES,
)

test_results = dict()
for shells in range(1, NSHELLS+1):
    settings['mode'] = 'systematic'  # perform exhaustive search
    settings['shell_weights'] = {i: 1.0/i for i in range(1, shells + 1)}
    # compute the best value of the objective function by exhaustive enumeration
    sys_results, *_ = sqs_optimize(settings, fields=('objective',))
    best_objective = min(sys_results.values(), key=item('objective')).get('objective')
    test_results[shells] = []  # create a list where we store
    for mag in range(MIN_MAGNITUDE, MAX_MAGNITUDE+1):
        settings['mode'] = 'random'
        settings['iterations'] = int(10**mag)
        results, *_ = sqs_optimize(settings, minimal=False, similar=True, fields=('objective',))
        # compute percentage of structures that exhibit minimal objective
        percent = sum(isclose(r.get('objective'), best_objective) for r in results.values()) / len(results) * 100
        test_results[shells].append((mag, percent))


def transpose(it) -> zip:
    return zip(*it)


# visualize the data
for shells, data in sorted(test_results.items(), key=item(0)):
    plt.plot(*transpose(data), marker='o', label=f'$S={shells}$')

plt.axvline(log10(conf_space_size), color='k',
            label='exhaustive')
plt.xlabel(r'$\log(N^{iter})$')
plt.ylabel(r'$\frac{N^{best}}{N^{total}} [\%]$')
plt.legend()
plt.savefig('convergence_test.pdf')
```

- **Line 7:** compute the total number of iterations for the exhaustive search according to Eq. {eq}`eqn:multinomial`. In the present case $N^{\text{iterations}} = \frac{36!}{12!24!} \approx 1.25 \cdot 10^9$
- **Line 11:** for the Monte-Carlo approach it does not make sense to go beyond $10^9$ iterations, as otherwise one could use exhaustive search anyway.
- **Lines 15-19:** set up the configuration for *sqsgenerator*. Create a 36-atom cell (replicate a 3-atom [Ti](https://materialsproject.org/materials/mp-72) {download}`cell ` by $2 \times 2 \times 3$) and distribute 12 rhenium and 24 tungsten atoms. Rhenium and tungsten serve only as dummy species.
- **Lines 23-24:** at first we compute the best value of the objective function $\mathcal{O}(\sigma)$ (Eq. {eq}`eqn:objective`) for a defined number of `shells`. Therefore, we set the iteration {ref}`mode ` and {ref}`shell_weights ` accordingly.
- **Line 26:** perform exhaustive enumeration
- **Line 32:** perform Monte-Carlo sampling of the configuration space for different numbers of iterations, ranging from `10**MIN_MAGNITUDE` to `10**MAX_MAGNITUDE`
- **Line 34:** compute the percentage of structures from the Monte-Carlo approach which exhibit the minimum objective.

The bottom part of the listing above visualizes the results using [matplotlib](https://matplotlib.org/). The output from the above script might look something like the figure below.

```{image} images/convergence_test.svg
:alt: Convergence of the optimization as a function of number of shell and number of iterations
:width: 60%
:align: center
```

Please remember that the figure above might look different for a different system (lattice).
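
As a closing aside, the configuration-space sizes quoted in this tutorial can be cross-checked without running `sqsgen`. The sketch below assumes that the size is the multinomial coefficient $N! / \prod_k n_k!$, as the expression $\frac{36!}{12!24!}$ above suggests; the helper name `configuration_space_size` is made up for this illustration and is not part of the *sqsgenerator* API.

```python
from math import factorial


def configuration_space_size(composition):
    """Count distinct arrangements of the given atoms: N! / (n_1! * n_2! * ...)."""
    total = factorial(sum(composition.values()))
    for count in composition.values():
        total //= factorial(count)  # every partial quotient is an exact integer
    return total


# TiN example: 16 B and 16 N atoms on the 32 selected nitrogen sites
print(configuration_space_size({"B": 16, "N": 16}))   # 601080390

# convergence test: 12 Re and 24 W atoms on 36 sites
print(configuration_space_size({"Re": 12, "W": 24}))
```

The first value agrees with the `compute total-permutations` output shown for the TiN example. Integer division is used throughout so that the result stays exact, whereas the float division in the convergence script is sufficient there because only the order of magnitude is needed.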