Creating a new project¶
We recommend that you create projects according to the Kedro default project template, which is ideal for analytics projects and comes with a default folder structure for storing datasets, folders for notebooks, configuration and source code.
Projects can be created interactively or by referencing a configuration file.
You can also work with a Kedro project that has already been created. In this case, you don’t need to create a new Kedro project, but can use git clone to clone the existing project.
Create a new project interactively¶
First, select the directory in which you want to work, and if you are using conda, make sure you have the correct environment activated:
conda activate environment_name
You are then ready to create a new project.
Call kedro new to create a new project in your current working directory (<current_dir>/<repo_name>/):
kedro new
You will need to provide the following variables:
- project_name - A human-readable name for your new project
- repo_name - A name for the directory that holds your project repository
- python_package - A Python package name for your project package (see Python package naming conventions)
- include_example - Confirms or rejects the inclusion of example code. If you enter Y to include an example, your new project template contains a small example to get you going. See the Hello World example for further details
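The relationship between the three names can be sketched in Python. This is only an illustration, not Kedro's own logic: the helpers suggest_names and is_valid_package_name are hypothetical, and the sketch assumes a convention of hyphens for repo_name and underscores for python_package, with a package name being a lowercase Python identifier.

```python
from __future__ import annotations

import re


def suggest_names(project_name: str) -> tuple[str, str]:
    """Hypothetical helper: derive repo_name and python_package from project_name."""
    # Replace every run of non-alphanumeric characters with a space, then split.
    words = re.sub(r"[^0-9a-zA-Z]+", " ", project_name).strip().lower().split()
    repo_name = "-".join(words)       # directory name, hyphen-separated
    python_package = "_".join(words)  # importable name, underscore-separated
    return repo_name, python_package


def is_valid_package_name(name: str) -> bool:
    """Assumed rule: lowercase letters, digits and underscores, not starting with a digit."""
    return re.fullmatch(r"[a-z_][a-z0-9_]*", name) is not None


repo_name, python_package = suggest_names("Getting Started")
print(repo_name, python_package)
```

A hyphenated name such as the suggested repo_name would fail the package-name check, which is why the two values are kept separate.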
Create a new project from a configuration file¶
You can also create a new project from a configuration file by running:
kedro new --config config.yml
The configuration file (config.yml) must contain the project_name, repo_name, python_package and include_example (Boolean value) variables described above, as well as output_dir, the path to the directory where the project folder will be created.
Here is an example config.yml, which assumes that a directory named ~/code already exists:
output_dir: ~/code
project_name: Getting Started
repo_name: getting-started
python_package: getting_started
include_example: true
output_dir can be set to ~ for your home directory, or . for the current working directory.
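Before passing a configuration file to kedro new, it can help to confirm that every required variable is present. The following is a minimal sketch of such a check, written against a plain dictionary rather than a parsed YAML file; the helper missing_keys and the constant REQUIRED_KEYS are hypothetical names, not part of Kedro.

```python
from __future__ import annotations

# The five variables the configuration file must contain, as listed above.
REQUIRED_KEYS = {
    "output_dir",
    "project_name",
    "repo_name",
    "python_package",
    "include_example",
}


def missing_keys(config: dict) -> list[str]:
    """Hypothetical helper: return the required keys absent from config, sorted."""
    return sorted(REQUIRED_KEYS - set(config))


# Same values as the example config.yml.
config = {
    "output_dir": "~/code",
    "project_name": "Getting Started",
    "repo_name": "getting-started",
    "python_package": "getting_started",
    "include_example": True,
}
print(missing_keys(config))
```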
Create a new project using starters¶
Kedro supports using custom starter templates to create your project via the --starter flag. To learn more about this feature, please read the guide to creating new projects with Kedro Starters.
Working with your new project¶
Initialise a git repository¶
Having created a new project, you may want to set up a new git repository by calling:
git init
Amend project-specific dependencies¶
Using kedro build-reqs¶
Once you have created a new project, you can update its dependencies. The generic project template bundles some typical dependencies in src/requirements.txt.
On the first use of your project, if you want to add or remove dependencies, edit src/requirements.txt.
Then run the following:
kedro build-reqs
The build-reqs command will:
- Generate src/requirements.in from the contents of src/requirements.txt
- pip-compile the requirements listed in src/requirements.in
- Regenerate src/requirements.txt to specify a list of pinned project dependencies (those with a strict version).
Note: src/requirements.in contains “source” requirements, while src/requirements.txt contains the compiled version of those and requires no manual updates. To further update the project requirements, you should modify src/requirements.in (not src/requirements.txt) and re-run kedro build-reqs.
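The difference between "source" and pinned requirements can be illustrated with a small Python check. This is a simplified sketch rather than how pip-compile works: the helper is_pinned is hypothetical, the package names and version numbers are made up for illustration, and only the == operator is treated as a strict version.

```python
def is_pinned(requirement: str) -> bool:
    """Hypothetical helper: True when a requirement line names an exact version."""
    # Ignore trailing comments and environment markers for this simple check.
    spec = requirement.split("#", 1)[0].split(";", 1)[0].strip()
    # A wildcard such as ==1.* still matches several versions, so it is not pinned.
    return "==" in spec and "*" not in spec


# Illustrative entries in the style of src/requirements.in (loose constraints).
source = ["pandas>=1.0", "requests"]
# Illustrative entries in the style of a compiled src/requirements.txt.
compiled = ["pandas==1.1.0", "requests==2.24.0"]

print([is_pinned(r) for r in source])
print([is_pinned(r) for r in compiled])
```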
If your project has conda dependencies, you can create a src/environment.yml file and list them there.
Using kedro install¶
To install the project-specific dependencies, run the following command:
kedro install
Note: kedro install automatically compiles project dependencies by running kedro build-reqs behind the scenes before the installation if the src/requirements.in file doesn’t exist. To skip the compilation step and install requirements as-is, run kedro install --no-build-reqs. To force the compilation even if src/requirements.in exists, run kedro install --build-reqs.
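The rules in the note above reduce to a small decision function. The sketch below models the described behaviour only and is not taken from Kedro's source; the function should_compile and its parameters are hypothetical names.

```python
from typing import Optional


def should_compile(requirements_in_exists: bool, build_reqs: Optional[bool] = None) -> bool:
    """Model of whether kedro install would run the compilation step.

    build_reqs is True for --build-reqs, False for --no-build-reqs,
    and None when neither flag is given.
    """
    if build_reqs is not None:
        # An explicit flag always wins over the default behaviour.
        return build_reqs
    # Default: compile only when src/requirements.in does not exist yet.
    return not requirements_in_exists


print(should_compile(requirements_in_exists=False))
print(should_compile(requirements_in_exists=True, build_reqs=True))
```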