Importing Data

Making a cell object

PyProBE stores all experimental data and information in a pyprobe.cell.Cell object. It has two main attributes:

a dictionary of cell details and experimental info (pyprobe.cell.Cell.info)
a dictionary of experimental procedures performed on the cell (pyprobe.cell.Cell.procedure).

A cell object can be created by passing a dictionary of cell details to the info keyword argument:

import pyprobe

# Describe the cell. The only required field is 'Name'.
info_dictionary = {'Name': 'Sample cell',
                   'Chemistry': 'NMC622',
                   'Nominal Capacity [Ah]': 0.04,
                   'Cycler number': 1,
                   'Channel number': 1,}

# Create a cell object
cell = pyprobe.Cell(info = info_dictionary)

The info dictionary can contain any number of key-value pairs that provide metadata to identify the cell and the conditions it was tested under.

Converting data to PyProBE Format

PyProBE defines a Procedure as a dataset collected from a single run of an experimental protocol created on a battery cycler. Throughout its life, a cell will likely undergo multiple procedures, such as beginning-of-life testing, degradation cycles and reference performance tests (RPTs).

Before adding data to a cell object, it must be converted into the PyProBE standard format. This is done with the process_cycler_file() method:

# From the previously created cell instance
cell.process_cycler_file(cycler = 'neware',
                         folder_path = 'path/to/root_folder/experiment_folder',
                         input_filename = 'cycler_file.csv',
                         output_filename = 'processed_cycler_file.parquet')

Your parquet file will be saved in the same directory (folder_path) as the input file. Once converted into this format, PyProBE is agnostic to cycler manufacturer and model. For more details on the PyProBE standard format, and an up-to-date list of supported cyclers, see the Input Data Guidance section.
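As a rough illustration of where the files end up, the sketch below builds the input and output paths from the same folder_path using the standard library. It only mirrors the statement above that both files share a directory; how PyProBE joins paths internally is an assumption here.

```python
from pathlib import PurePosixPath

# Same placeholder folder as in the example above
folder_path = PurePosixPath("path/to/root_folder/experiment_folder")

# Both files are assumed to be resolved relative to folder_path
input_path = folder_path / "cycler_file.csv"
output_path = folder_path / "processed_cycler_file.parquet"

print(input_path)
print(output_path)
```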

Working with multiple input files

Some cyclers output data in multiple files, for example BioLogic Modulo Bat procedures. Provided the files are all in the same folder, PyProBE can collect them and process them into a single parquet file. This is done by including a * wildcard in the input_filename argument:

# From the previously created cell instance
cell.process_cycler_file(cycler = 'neware',
                         folder_path = 'path/to/root_folder/experiment_folder',
                         input_filename = 'cycler_file*.csv',
                         output_filename = 'processed_cycler_file.parquet')

This will process all files in the folder that match the pattern cycler_file*.csv, e.g. cycler_file_1.csv, cycler_file_2.csv, etc.
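To check in advance which files a pattern would pick up, you can test it against your filenames with the standard library. This is a minimal sketch that assumes PyProBE follows the usual shell-style wildcard rules; the folder contents are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical folder contents, for illustration only
folder_contents = [
    "cycler_file_1.csv",
    "cycler_file_2.csv",
    "cycler_file.csv",
    "other_file_1.csv",
    "cycler_file_1.txt",
]

pattern = "cycler_file*.csv"

# Keep only the names that match the shell-style pattern
matched = sorted(name for name in folder_contents if fnmatchcase(name, pattern))
print(matched)
```

Under shell-style matching, * also matches an empty string, so a file named cycler_file.csv is selected as well. Choose a more specific pattern, such as cycler_file_*.csv, if that file should be excluded.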

The BioLogic Modulo Bat format has its own reader, 'biologic_MB':

cell.process_cycler_file(cycler = 'biologic_MB',
                         folder_path = 'path/to/root_folder/experiment_folder',
                         input_filename = 'cycler_file_*_MB.mpt',
                         output_filename = 'processed_cycler_file.parquet')

Adding data to a cell object

For data to be imported into PyProBE, there should be a corresponding README.yaml file in the same directory as the data file. This file contains details of the experimental procedure that generated the data. See the Writing a README file section for guidance.

A data file in the standard PyProBE format can be added to a cell object using the add_procedure() method. A procedure must be given a name when it is imported. Choose something descriptive, so it is easy to distinguish between different procedures that have been run on the same cell.

# Add the processed data to the cell object
cell.add_procedure(procedure_name = 'Example procedure',
                   folder_path = 'path/to/root_folder/experiment_folder',
                   filename = 'processed_cycler_file.parquet')

Any number of procedures can be added to a cell, for example:

# Add the first procedure
cell.add_procedure(procedure_name = 'Cycling',
                   folder_path = 'path/to/root_folder/experiment_folder',
                   filename = 'processed_cycler_file_cycling.parquet')

# Add the second procedure
cell.add_procedure(procedure_name = 'RPT',
                   folder_path = 'path/to/root_folder/experiment_folder',
                   filename = 'processed_cycler_file_RPT.parquet')

print(cell.procedure)
# Returns: dict({'Cycling': <pyprobe.procedure.Procedure object…, 'RPT': <pyprobe.procedure.Procedure object…})

If you want to load data quickly, for simple analysis or viewing, the quick_add_procedure() method allows for importing without a README.yaml file.

Batch preprocessing

If you have multiple cells undergoing the same experimental procedures, you can use the built-in batch processing functionality in PyProBE to speed up your workflow. You must first create a list of Cell objects.

The fastest way to do this is to store an Experiment Record alongside your data. This is an Excel file that contains important experimental information about your cells and the procedures they have undergone. See the Writing an Experiment record section for guidance.

Once you have an Experiment Record, you can create a list of cells using the make_cell_list() function:

cell_list = pyprobe.make_cell_list(record_filepath = 'path/to/experiment_record.xlsx',
                                   worksheet_name = 'Sample experiment')

This function creates a list of cells, where the info dictionary is populated with the information from the Experiment Record.

You can then add procedures to each cell in the list. add_procedure() includes the functionality to do this parametrically. The steps are as follows:

Define a function that generates the filename for each cell.
Assign the filename generator function to the filename argument in add_procedure() (or to the input_filename and output_filename arguments in process_cycler_file()).
Provide the inputs to the filename generator function in the filename_inputs argument. The order of the inputs must match the order of the arguments in the filename generator function. These inputs must be keys of the info dictionary. This means that they are likely to be column names in the Experiment Record Excel file.

# Define functions that generate the input and output filenames for each cell
def input_name_generator(cycler, channel):
    return f'cycler_file_{cycler}_{channel}.csv'

def output_name_generator(cycler, channel):
    return f'processed_cycler_file_{cycler}_{channel}.parquet'

# Convert the data to PyProBE format and add the procedure to each cell in the list
for cell in cell_list:
    cell.process_cycler_file(cycler = 'neware',
                             folder_path = 'path/to/root_folder/experiment_folder',
                             input_filename = input_name_generator,
                             output_filename = output_name_generator,
                             filename_inputs = ["Cycler", "Channel"])

    cell.add_procedure(procedure_name = 'Cycling',
                       folder_path = 'path/to/root_folder/experiment_folder',
                       filename = output_name_generator,
                       filename_inputs = ["Cycler", "Channel"])
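The way filename_inputs connects the info dictionary to the generator function can be pictured with a short standalone sketch. The helper resolve_filename below is hypothetical and is not part of PyProBE; it only illustrates the described behaviour, where each entry of filename_inputs is looked up in the cell's info dictionary and passed to the generator in order.

```python
def output_name_generator(cycler, channel):
    return f"processed_cycler_file_{cycler}_{channel}.parquet"


def resolve_filename(filename, info, filename_inputs=None):
    """Hypothetical helper: return a concrete filename for one cell.

    If filename is a function, look up each key of filename_inputs in the
    info dictionary and pass the values as positional arguments, in order.
    Otherwise return the filename unchanged.
    """
    if callable(filename):
        args = [info[key] for key in filename_inputs]
        return filename(*args)
    return filename


# Hypothetical info dictionary, as it might be populated from an Experiment Record
info = {"Name": "Sample cell", "Cycler": 1, "Channel": 3}

resolved = resolve_filename(output_name_generator, info, ["Cycler", "Channel"])
print(resolved)
```

Because the values are passed positionally, swapping the order of the keys in filename_inputs would swap the cycler and channel numbers in the generated filename.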

Adding data not from a cycler

In a battery experiment, you will often collect data from sources other than the battery cycler. This data can be added to a Procedure object after it has been created, using its add_external_data() method.

The data that you provide must be a time series, with a column that can be interpreted as a date and time. This is usually a string such as "2024-02-29 09:19:58.554". PyProBE interpolates your data onto the time base of the existing cycling data, so it can be filtered as normal.
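The idea of aligning external data with the cycler's time base can be sketched with the standard library alone. This is not PyProBE's implementation: the linear interpolation, the choice to hold the edge values outside the logged range, and the sample temperature log are all assumptions made for illustration.

```python
from bisect import bisect_left
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Hypothetical external log: (timestamp, temperature in degC)
external = [
    ("2024-02-29 09:19:58.000", 25.0),
    ("2024-02-29 09:20:08.000", 27.0),
]

# Hypothetical timestamps of the cycler data points
cycler_stamps = [
    "2024-02-29 09:19:58.000",
    "2024-02-29 09:20:03.000",
    "2024-02-29 09:20:08.000",
]

origin = datetime.strptime(external[0][0], FMT)


def to_seconds(stamp):
    """Convert a timestamp string to seconds elapsed since the first external sample."""
    return (datetime.strptime(stamp, FMT) - origin).total_seconds()


ext_t = [to_seconds(stamp) for stamp, _ in external]
ext_v = [value for _, value in external]


def interpolate(t):
    """Linearly interpolate the external signal at time t (seconds)."""
    # Assumption: hold the first or last value outside the logged range
    if t <= ext_t[0]:
        return ext_v[0]
    if t >= ext_t[-1]:
        return ext_v[-1]
    i = bisect_left(ext_t, t)
    t0, t1 = ext_t[i - 1], ext_t[i]
    v0, v1 = ext_v[i - 1], ext_v[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# External values aligned to the cycler's time points
aligned = [interpolate(to_seconds(stamp)) for stamp in cycler_stamps]
print(aligned)
```

Once the external signal has a value at every cycler time point, it can sit alongside current and voltage as an extra column and be filtered in the same way.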