Importing Data
Making a cell object
PyProBE stores all experimental data and information in a pyprobe.cell.Cell object. It has two main attributes:
- a dictionary of cell details and experimental info (pyprobe.cell.Cell.info)
- a dictionary of experimental procedures performed on the cell (pyprobe.cell.Cell.procedure)
A cell object can be created by passing an info dictionary to the info keyword argument:
import pyprobe

# Describe the cell. The only required field is 'Name'.
info_dictionary = {'Name': 'Sample cell',
                   'Chemistry': 'NMC622',
                   'Nominal Capacity [Ah]': 0.04,
                   'Cycler number': 1,
                   'Channel number': 1}

# Create a cell object
cell = pyprobe.Cell(info = info_dictionary)
The info dictionary can contain any number of key-value pairs that provide metadata to identify the cell and the conditions it was tested under.
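As a minimal sketch of the one stated requirement (that 'Name' is present), the check below can be run on a dictionary before it is passed to pyprobe.Cell. The helper missing_required_fields and the REQUIRED_FIELDS tuple are hypothetical names used for illustration only; they are not part of PyProBE, and how the library itself validates the dictionary is not covered here.

```python
# Hypothetical pre-flight check; not part of the PyProBE API.
REQUIRED_FIELDS = ("Name",)  # assumption: 'Name' is the only required key


def missing_required_fields(info):
    """Return the required keys that are absent from an info dictionary."""
    return [key for key in REQUIRED_FIELDS if key not in info]


info_dictionary = {
    "Name": "Sample cell",
    "Chemistry": "NMC622",
    "Nominal Capacity [Ah]": 0.04,
}

print(missing_required_fields(info_dictionary))       # nothing is missing
print(missing_required_fields({"Chemistry": "LFP"}))  # 'Name' is missing
```

Any extra keys are left alone, which matches the idea that the dictionary may hold arbitrary metadata.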
Converting data to PyProBE Format
PyProBE defines a Procedure as a dataset collected from a single run of an experimental protocol created on a battery cycler. Throughout its life, a cell will likely undergo multiple procedures, such as beginning-of-life testing, degradation cycles, reference performance tests (RPTs), etc.
Before adding data to a cell object, it must be converted into the PyProBE standard format. This is done with the process_cycler_file() method:
# From the previously created cell instance
cell.process_cycler_file(cycler = 'neware',
                         folder_path = 'path/to/root_folder/experiment_folder',
                         input_filename = 'cycler_file.csv',
                         output_filename = 'processed_cycler_file.parquet')
Your parquet file will be saved in the same directory (folder_path) as the input file. Once converted into this format, PyProBE is agnostic to cycler manufacturer and model. For more details on the PyProBE standard format, and an up-to-date list of supported cyclers, see the Input Data Guidance section.
Working with multiple input files
Some cyclers may output data in multiple files, for example BioLogic Modulo Bat procedures. Assuming the data is all in the same folder, PyProBE is able to collect all of the files and process them into a single parquet file. This is done by providing a * wildcard in the input_filename argument:
# From the previously created cell instance
cell.process_cycler_file(cycler = 'neware',
                         folder_path = 'path/to/root_folder/experiment_folder',
                         input_filename = 'cycler_file*.csv',
                         output_filename = 'processed_cycler_file.parquet')
This will process all files in the folder that match the pattern cycler_file*.csv, e.g. cycler_file_1.csv, cycler_file_2.csv, etc.
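To preview which files a pattern would pick up, the standard-library fnmatch module can be applied to a list of names. This is only a sketch of shell-style wildcard matching: it assumes, rather than confirms, that PyProBE resolves * in the same way, and the filenames are invented for illustration.

```python
from fnmatch import fnmatch

pattern = "cycler_file*.csv"

# Hypothetical folder contents, for illustration only.
filenames = [
    "cycler_file_1.csv",
    "cycler_file_2.csv",
    "cycler_file.csv",
    "other_file_1.csv",
    "cycler_file_1.txt",
]

# Keep only the names that match the shell-style pattern.
matches = [name for name in filenames if fnmatch(name, pattern)]
print(matches)
```

Under shell-style rules the bare cycler_file.csv also matches, because * can match an empty string. If that file should be excluded, a narrower pattern such as cycler_file_*.csv is one option.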
The BioLogic Modulo Bat format has its own reader, 'biologic_MB':
cell.process_cycler_file(cycler = 'biologic_MB',
                         folder_path = 'path/to/root_folder/experiment_folder',
                         input_filename = 'cycler_file_*_MB.mpt',
                         output_filename = 'processed_cycler_file.parquet')
Adding data to a cell object
For data to be imported into PyProBE, there should be a corresponding README.yaml
file in the same directory as the data file. This file contains details of the
experimental procedure that generated the data. See the Writing a README file
section for guidance.
A data file in the standard PyProBE format can be added to a cell object using the add_procedure() method. A procedure must be given a name when it is imported. Choose something descriptive, so it is easy to distinguish between different procedures that have been run on the same cell.
# Add the processed data to the cell object
cell.add_procedure(procedure_name = 'Example procedure',
                   folder_path = 'path/to/root_folder/experiment_folder',
                   filename = 'processed_cycler_file.parquet')
Any number of procedures can be added to a cell, for example:
# Add the first procedure
cell.add_procedure(procedure_name = 'Cycling',
                   folder_path = 'path/to/root_folder/experiment_folder',
                   filename = 'processed_cycler_file_cycling.parquet')

# Add the second procedure
cell.add_procedure(procedure_name = 'RPT',
                   folder_path = 'path/to/root_folder/experiment_folder',
                   filename = 'processed_cycler_file_RPT.parquet')

print(cell.procedure)
# Returns: dict({'Cycling': <pyprobe.procedure.Procedure object…, 'RPT': <pyprobe.procedure.Procedure object…})
If you want to load data quickly, for simple analysis or viewing, the quick_add_procedure() method allows for importing without a README.yaml file.
Batch preprocessing
If you have multiple cells undergoing the same experimental procedures, you can use the built-in batch processing functionality in PyProBE to speed up your workflow. You must first create a list of Cell objects.
The fastest way to do this is to store an Experiment Record alongside your data. This is an Excel file that contains important experimental information about your cells and the procedures they have undergone. See the Writing an Experiment record section for guidance.
Once you have an Experiment Record, you can create a list of cells using the make_cell_list() function:
cell_list = pyprobe.make_cell_list(record_filepath = 'path/to/experiment_record.xlsx',
                                   worksheet_name = 'Sample experiment')
This function creates a list of cells, where the info dictionary is populated with the information from the Experiment Record.
You can then add procedures to each cell in the list. add_procedure() includes the functionality to do this parametrically. The steps are as follows:
1. Define a function that generates the filename for each cell.
2. Assign the filename generator function to the filename argument in add_procedure().
3. Provide the inputs to the filename generator function in the filename_inputs argument. The order of the inputs must match the order of the arguments in the filename generator function. These inputs must be keys of the info dictionary. This means that they are likely to be column names in the Experiment Record Excel file.
# Define functions that generate the input and output filenames for each cell
def input_name_generator(cycler, channel):
    return f'cycler_file_{cycler}_{channel}.csv'

def output_name_generator(cycler, channel):
    return f'processed_cycler_file_{cycler}_{channel}.parquet'

# Convert the data to PyProBE format and add the procedure to each cell in the list
for cell in cell_list:
    cell.process_cycler_file(cycler = 'neware',
                             folder_path = 'path/to/root_folder/experiment_folder',
                             input_filename = input_name_generator,
                             output_filename = output_name_generator,
                             filename_inputs = ["Cycler", "Channel"])
    cell.add_procedure(procedure_name = 'Cycling',
                       folder_path = 'path/to/root_folder/experiment_folder',
                       filename = output_name_generator,
                       filename_inputs = ["Cycler", "Channel"])
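The relationship between filename_inputs and the generator function can be sketched in plain Python. The helper resolve_filename is hypothetical and only illustrates the look-up-then-call idea described in the steps above; it is not PyProBE's implementation, and the info values are made up.

```python
def output_name_generator(cycler, channel):
    return f"processed_cycler_file_{cycler}_{channel}.parquet"


def resolve_filename(filename, filename_inputs, info):
    """Hypothetical helper: build a filename from a cell's info dictionary."""
    if callable(filename):
        # Look up each input key in order, then pass the values positionally.
        args = [info[key] for key in filename_inputs]
        return filename(*args)
    return filename  # already a plain string, so use it unchanged


# Made-up info dictionary, as it might be read from an Experiment Record row.
info = {"Name": "Cell 1", "Cycler": 3, "Channel": 7}

print(resolve_filename(output_name_generator, ["Cycler", "Channel"], info))
# processed_cycler_file_3_7.parquet
```

Because the values are passed positionally, listing the keys as ["Channel", "Cycler"] would swap the two numbers in the filename, which is why the order of filename_inputs has to match the function signature.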
Adding data not from a cycler
In your battery experiment, it is likely that you will be collecting data from sources other than your battery cycler. This data can be added to your Procedure object after it has been created, using its add_external_data() method.
The data that you provide must be a time series, with a column that can be interpreted in DateTime format. This is usually a string that may appear like: "2024-02-29 09:19:58.554".
PyProBE will interpolate your data onto the time series of the cycling data already in the procedure, so it can be filtered as normal.
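To make the idea of aligning external data with cycler timestamps concrete, the sketch below parses timestamp strings in the format shown above and interpolates linearly between external samples. It uses only the standard library. The linear method, the clamping at the ends, and all names and values are assumptions for illustration; the interpolation scheme PyProBE actually applies is not specified here.

```python
from bisect import bisect_left
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # matches strings like "2024-02-29 09:19:58.554"


def to_seconds(stamp, origin):
    """Seconds elapsed from the origin timestamp string to stamp."""
    delta = datetime.strptime(stamp, FMT) - datetime.strptime(origin, FMT)
    return delta.total_seconds()


def interpolate(x, xs, ys):
    """Linearly interpolate (xs, ys) at x; xs must be sorted ascending.

    Values outside the sampled range are clamped to the nearest end point.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)  # first index with xs[i] >= x
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up external temperature log with two samples 10 s apart.
external_times = ["2024-02-29 09:19:50.000", "2024-02-29 09:20:00.000"]
external_temps = [20.0, 30.0]

# Made-up cycler timestamps that fall between the external samples.
cycler_times = ["2024-02-29 09:19:52.500", "2024-02-29 09:19:55.000"]

origin = external_times[0]
xs = [to_seconds(t, origin) for t in external_times]
aligned = [interpolate(to_seconds(t, origin), xs, external_temps)
           for t in cycler_times]
print(aligned)  # [22.5, 25.0]
```

After this step each cycler timestamp has a matching external value, which is what allows the external column to be filtered alongside the cycling data.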