Example 1. Create your first Level 4 package
In this example, we assume Mr Neutron previously initiated a project called diffraction-utils
using scikit-package
Level 3 and developed a shared class called DiffractionObject
. This DiffractionObject
is used in code analyzing various diffraction data sourced from x-ray, neutron, and electron instruments. This class is written in a module called diffraction_objects.py
which is reused in Python scripts in the scattering
sub-project folder. Mr Neutron’s folder structure looks like the following:
somewhere/
|-- on/
|-- my/
|-- computer/
|-- diffraction-utils/
|-- README.md
|-- diffraction_objects.py
|-- scattering/
|-- __init__.py
|-- neutron.py
|-- xray.py
|-- electron.py
Since the project is a data analysis project, Mr Neutron followed common practice to place this diffraction-utils
folder near the location of the data. Since Mr Neutron wants to create a package that is shared across many different projects, it is recommended to place the package in a common directory where he keeps all his reused system-wide packages. This could be called anything, but following standard practice he called this directory dev (roughly short for code-development-area) that he placed in his home directory. In the example, Mr Neutron is working on Windows but using a bash terminal from the “Git for Windows”(https://git-scm.com/downloads/win) software. This is a recommended setup for Windows users.
Create a new empty project with scikit-package
Using the bash terminal, Mr Neutron navigates to his dev
directory, activates the Conda environment where scikit-package
is installed, and creates an empty Level 4 system project by typing these commands:
$ cd ~/dev
$ conda activate skpkg-env
$ package create system
Mr Neutron enters the following values to the questions scikit-package
prompts (no response indicates he hit enter to accept the default value in parentheses):
[1/6] project_name (my-project): diffraction-utils
[2/6] github_username_or_orgname (billingegroup): mrneutron44
[3/6] github_repo_name (my-project): diffraction-utils
[4/6] conda_pypi_package_dist_name (diffraction-utils):
[5/6] package_dir_name (diffraction_utils):
[6/6] contributors (Sangjoon Lee, Simon Billinge): Mr Neutron
Mr Neutron now sees the following empty package created on the hard-drive:
~/dev/
|-- diffraction-utils/
|-- .flake8
|-- .github
|-- ISSUE_TEMPLATE
|-- bug_feature.md
|-- workflows
|-- tests-on-pr.yml
|-- .gitignore
|-- .pre-commit-config.yaml
|-- LICENSE.rst
|-- README.md
|-- requirements
|-- conda.txt
|-- pip.txt
|-- tests.txt
|-- src/
|-- diffraction_utils/
|-- __init__.py
|-- functions.py
|-- tests
|-- test_functions.py
The created files are described in detail in the scikit-package
in (Level 4) Share your code as a locally installable Python package.
Copy files to the new project directory
At this point, scikit-package
simply created an empty package with files appropriately named based on Mr Neutron’s responses. Next Mr Neutron copies his code, the diffraction_objects.py
file into the diffraction_utils
folder. He can do this using Windows explorer, but chooses to do it in a terminalby typing the following commands:
$ cd ~/dev/diffraction-utils/diffraction_utils
$ cp somewhere/on/my/computer/diffraction-utils/diffraction_objects.py .
Mr Neutron made sure to type the small dot (“.”) at the end of the last command.
Install package and test
There are still a few quick steps that Mr Neutron needs to complete before the code is available everywhere in his computer.
First, Mr Neutron needs to specify the package dependencies within the conda.txt
and pip.txt
files in the requirements folder. In this example, Mr Neutron enters numpy and matplotlib-base in conda.txt
and numpy
and matplotlib
in pip.txt
, one per line. After he has done this the contents of those files can be seen using the bash less command, which prints the contents of a file, as Mr Neutron does below,
$ cd requirements
$ less conda.txt
numpy
matplotlib-base
$ less pip.txt
numpy
matplotlib
In general, conda.txt
and pip.txt
will contain the same list of dependencies. They are the dependencies that will be installed when installing from conda and PyPI, respectively. The reason we need separate files is that some packages have a different name on Conda and PyPI, respectively. For example, to install the lightest-weight vesrsion of matplotlib
, for historical reasons, the Conda package is called matplotlib-base
while it is matplotlib
on PyPI. Any other such differences in package names across conda-forge and PyPI can also be handled this way.
Second, Mr Neutron must build a virtual environment and install his new scikit-package
package in it. Mr Neutron decides to create a new conda environment dedicated for his diffraction-utils
package (he could have chosen to install the package in one of his existing environments). He first deactivates the skpkg-env
(conda deactivate) as he has finished the work of using scikit-package
to create a new project. He then creates a new conda environment called diff-utils-env
using Python 3.13, installing the dependencies listed under conda.txt
, and builds and installs his own diffraction-utils
package, using the following commands:
$ cd ~/dev/diffraction-utils
$ conda create -n diff-utils-env python=3.13
$ conda activate diff-utils-env
$ conda install --file requirements/conda.txt
$ pip install . --no-deps
To test the installed package, Mr Neutron imports the DiffractionObject
class from the diffraction-utils
package in a Python module called neutron.py
, located in the folder path /data-analysis/neutron-experiment/
. Mr Neutron writes the following line at the top of the neutron.py
file to import the DiffractionObject
class,
# ~/data-analysis/neutron-experiment/neutron.py
from diffraction_utils.diffraction_objects import DiffractionObject
without having to change any of the other code. With the diff-utils-env
environment activated, Mr Neutron can run the code below:
# cd ~/data-analysis/neutron-experiment
$ python neutron.py
We note that, since Mr Neutron is the developer and wants to update the code as he is working, he can installs the package in his environment in “editable” mode (recommended) where he replaces the command pip install .
with the following:
$ pip install -e .
As long as Mr Neutron has activated the diff-utils-env
conda environment, whenever he runs code anywhere on his computer it will run the version of the code it finds on disc at run-time without him having to reinstall the package. This is very convenient for developers, but is not the preferred installation method for users.
Use Git to track changes
As the next step, Mr Neutron wants to continuously maintain the diffraction-utils
package. The best way to do this is to use Git. It is beyond the scope of this article to explain Git in detail, but the main concept is that Git maintains a database on the computer with every version of every file in the user’s project, so the user never loses work and can find any earlier version. Every time Mr Neutron “commits” the edits to the Git database, it stores the edits.
To set up Git, Mr Neutron types the following command once to initiate the Git database for the project folder (including subdirectories):
# ~/dev/diffraction-utils
$ git init
Mr Neutron then creates the first commit through the following commands:
$ git add .
$ git commit -m "initial commit of the package files"
The git add .
command adds all the files in the current directory (including subdirectories) to the list of files that will be committed to the database next time the user makes a commit. The git commit
command actually commits those edits and changes to the database with a clear commit message describing the edits.
Set up pre-commit hooks to automatically check syntax
Once the local repository is under Git control, Mr Neutron wants to ensure that the code is properly formatted before committing to the Git database. This can be done by triggering pre-commit
to run each time a new commit is attempted. This is done with a pre-commit
hook. A hook automatically runs a program, or programs, before making a Git commit, every time Mr Neutron runs git commit
. scikit-package
uses the pre-commit
package to manage this. To get this set up, Mr Neutron (who has already installed the pre-commit
package in his environment with conda install pre-commit
), types this command:
$ precommit install
Now, every time Mr Neutron runs git commit -m “<commit message>”, he will see in the terminal the hooks being executed, for example:
$ git commit -m "chore: implement local precommit hooks"
black...........................................................Passed
prettier........................................................Passed
docformatter....................................................Passed
Mr Neutron proceeds to add new features and bug-fixes to the diffraction-utils package and committing the changes, but sometimes pre-commit
hooks fail. If this happens, Mr Neutron will see that the most recent commit was not written to the Git database, for example by using the git log
command. Mr Neutron can then fix those errors manually and rerun git commit
.
At any time, Mr Neutron can use the pre-commit run --all-files
command to trigger pre-commit
manually while fixing those errors so that he does not have to make a commit to run the checks. As discussed in Section 2, pre-commit
auto-fixes based on the configurations provided in .pre-commit-config.yaml
in the project directory. The output from pre-commit
informs Mr Neutron which files, and which line in the file, caused the error, helping him fix everything up.
Use GitHub to backup code online
Having successfully made edits to the code and commit them to his local Git database, Mr Neutron now wants to back-up his work online. scikit-package
is integrated with GitHub which is a cloud-based platform for uploading and sharing Git projects.
Mr Neutron first creates a new repository on GitHub in his user space, mrneutron(he could create it in any organization that he owns) and enters diffraction-utils for the repository name. He selects the option to create an empty repository (without an auto- generated README
, .gitignore
, or LICENSE
file) since these files are already created with scikit-package
. Mr Neutron connects the local to the remote (cloud) repository by typing:
# ~/dev/diffraction-utils
$ git remote add origin https://github.com/mrneutron/diffraction-utils.git
The term origin is an alias (name) for the remote repository. Mr Neutron then uploads the local repository content to the remote repository with:
$ git push --set-upstream origin main
Mr Neutron can view the content uploaded to the remote GitHub repository. He can also make edits directly on the remote repository and synchronize those changes with the local repository using the git pull origin main
command.