Providing Valid Inputs
PyProBE uses Pydantic for input validation, which ensures that the data you provide is in the correct format and so prevents unexpected errors later on. Most of the time this happens behind the scenes, so you will only notice it when there is a problem. This example demonstrates how these errors can arise.
RawData Validation
The RawData class is a specific variant of the Result object that only stores data in the standard PyProBE format. Validation is therefore performed whenever a RawData object is created, to verify that the data matches this format.
If you follow the standard method for importing data into PyProBE, you should never encounter these errors; however, it is helpful to know that they exist.
We will start with a normal dataset, printing the type that the procedure data is stored in:
[1]:
import pyprobe
import polars as pl

info_dictionary = {
    "Name": "Sample cell",
    "Chemistry": "NMC622",
    "Nominal Capacity [Ah]": 0.04,
    "Cycler number": 1,
    "Channel number": 1,
}
data_directory = "../../../tests/sample_data/neware"

# Create a cell object
cell = pyprobe.Cell(info=info_dictionary)
cell.add_procedure(
    procedure_name="Sample",
    folder_path=data_directory,
    filename="sample_data_neware.parquet",
)
print(type(cell.procedure["Sample"]))
<class 'pyprobe.filters.Procedure'>
The Procedure class inherits from RawData, which has a defined set of required columns (the PyProBE standard format):
[2]:
print(pyprobe.rawdata.required_columns)
['Time [s]', 'Step', 'Event', 'Current [A]', 'Voltage [V]', 'Capacity [Ah]']
Whenever a RawData object (or an object of any of the filters module classes that inherit from it) is created, the dataframe is checked against these required columns. We will create an example dataframe that is missing some of them; the error that is raised identifies the missing columns.
[3]:
incorrect_dataframe = pl.DataFrame(
    {
        "Time [s]": [1, 2, 3],
        "Voltage [V]": [3.5, 3.6, 3.7],
        "Current [A]": [0.1, 0.2, 0.3],
    }
)
pyprobe.rawdata.RawData(base_dataframe=incorrect_dataframe, info={})
2025-01-22 17:43:53,543 - pyprobe.rawdata - ERROR - Missing required columns: ['Step', 'Event', 'Capacity [Ah]']
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In[3], line 8
1 incorrect_dataframe = pl.DataFrame(
2 {
3 "Time [s]": [1, 2, 3],
(...)
6 }
7 )
----> 8 pyprobe.rawdata.RawData(base_dataframe=incorrect_dataframe, info={})
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/main.py:214, in BaseModel.__init__(self, **data)
212 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
213 __tracebackhide__ = True
--> 214 validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
215 if self is not validated_self:
216 warnings.warn(
217 'A custom validator is returning a value other than `self`.\n'
218 "Returning anything other than `self` from a top level model validator isn't supported when validating via `__init__`.\n"
219 'See the `model_validator` docs (https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more details.',
220 stacklevel=2,
221 )
ValidationError: 1 validation error for RawData
base_dataframe
Value error, Missing required columns: ['Step', 'Event', 'Capacity [Ah]'] [type=value_error, input_value=shape: (3, 3)
┌──...───────┘, input_type=DataFrame]
For further information visit https://errors.pydantic.dev/2.10/v/value_error
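If you build dataframes by hand, it can be convenient to check the column names yourself before constructing a RawData object. The following is a minimal sketch of such a check in plain Python. It is not PyProBE's own implementation: the helper name find_missing_columns is hypothetical, and the required column list is copied from the output printed above.

```python
# Required columns, copied from the pyprobe.rawdata.required_columns output above.
REQUIRED_COLUMNS = [
    "Time [s]",
    "Step",
    "Event",
    "Current [A]",
    "Voltage [V]",
    "Capacity [Ah]",
]


def find_missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`.

    Hypothetical helper for illustration; the result keeps the order of `required`.
    """
    present = set(columns)
    return [name for name in required if name not in present]


# Same column names as the incorrect dataframe in the example above.
missing = find_missing_columns(["Time [s]", "Voltage [V]", "Current [A]"])
print(missing)  # ['Step', 'Event', 'Capacity [Ah]']
```

With a Polars DataFrame, the column names can be passed in as df.columns.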
You will also see a validation error if you try to create one of these classes with a data object that is not a Polars DataFrame or LazyFrame:
[4]:
incorrect_data_dict = {
    "Time [s]": [1, 2, 3],
    "Voltage [V]": [3.5, 3.6, 3.7],
    "Current [A]": [0.1, 0.2, 0.3],
}
pyprobe.rawdata.RawData(base_dataframe=incorrect_data_dict, info={})
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In[4], line 6
1 incorrect_data_dict = {
2 "Time [s]": [1, 2, 3],
3 "Voltage [V]": [3.5, 3.6, 3.7],
4 "Current [A]": [0.1, 0.2, 0.3],
5 }
----> 6 pyprobe.rawdata.RawData(base_dataframe=incorrect_data_dict, info={})
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/main.py:214, in BaseModel.__init__(self, **data)
212 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
213 __tracebackhide__ = True
--> 214 validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
215 if self is not validated_self:
216 warnings.warn(
217 'A custom validator is returning a value other than `self`.\n'
218 "Returning anything other than `self` from a top level model validator isn't supported when validating via `__init__`.\n"
219 'See the `model_validator` docs (https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more details.',
220 stacklevel=2,
221 )
ValidationError: 2 validation errors for RawData
base_dataframe.is-instance[LazyFrame]
Input should be an instance of LazyFrame [type=is_instance_of, input_value={'Time [s]': [1, 2, 3], '...t [A]': [0.1, 0.2, 0.3]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/is_instance_of
base_dataframe.is-instance[DataFrame]
Input should be an instance of DataFrame [type=is_instance_of, input_value={'Time [s]': [1, 2, 3], '...t [A]': [0.1, 0.2, 0.3]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/is_instance_of
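The report above lists two errors for a single bad input because the field accepts either of two types, and the input is checked against each in turn. The sketch below imitates that behaviour in plain Python to show where the count comes from. It is an illustration only, not Pydantic's implementation: check_is_one_of is a hypothetical helper, and the LazyFrame and DataFrame classes are empty stand-ins rather than the Polars classes.

```python
class LazyFrame:
    """Empty stand-in for illustration (not polars.LazyFrame)."""


class DataFrame:
    """Empty stand-in for illustration (not polars.DataFrame)."""


def check_is_one_of(value, allowed_types):
    """Return one message per allowed type that `value` fails to match.

    An empty list means the value matched one of the allowed types.
    """
    errors = []
    for allowed in allowed_types:
        if isinstance(value, allowed):
            # A single match is enough, so earlier failures are discarded.
            return []
        errors.append(f"Input should be an instance of {allowed.__name__}")
    return errors


# A dictionary matches neither type, so two messages are collected.
errors = check_is_one_of({"Time [s]": [1, 2, 3]}, (LazyFrame, DataFrame))
for message in errors:
    print(message)
```

The same pattern applies to any input that accepts several types: the more types are allowed, the more errors are reported for one invalid value.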
Analysis Module Validation
You are much more likely to experience validation errors when dealing with the functions and classes in the analysis module. These may require a particular PyProBE object to work.
As an example, the Cycling class requires an Experiment input. This is because it provides calculations based on the cycle() method of the Experiment class:
[5]:
experiment_object = cell.procedure["Sample"].experiment("Break-in Cycles")
print(type(experiment_object))
<class 'pyprobe.filters.Experiment'>
Passing the experiment object raises no errors:
[6]:
from pyprobe.analysis.cycling import Cycling
cycling = Cycling(input_data=experiment_object)
However, if we filter the object further, for example with cycle(1), the result is a Cycle object rather than an Experiment, so validation fails:
[7]:
cycling = Cycling(input_data=experiment_object.cycle(1))
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In[7], line 1
----> 1 cycling = Cycling(input_data=experiment_object.cycle(1))
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/main.py:214, in BaseModel.__init__(self, **data)
212 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
213 __tracebackhide__ = True
--> 214 validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
215 if self is not validated_self:
216 warnings.warn(
217 'A custom validator is returning a value other than `self`.\n'
218 "Returning anything other than `self` from a top level model validator isn't supported when validating via `__init__`.\n"
219 'See the `model_validator` docs (https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more details.',
220 stacklevel=2,
221 )
ValidationError: 1 validation error for Cycling
input_data
Input should be a valid dictionary or instance of Experiment [type=model_type, input_value=Cycle(base_dataframe=<Laz...hours']}, cycle_info=[]), input_type=Cycle]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
Functions in the analysis module also contain type validation. This occurs on two levels. First, the inputs to the function are checked. For example, the gradient function of the differentiation module requires input_data to be a PyProBE object, and x and y to be string column names:
[8]:
from pyprobe.analysis import differentiation
gradient = differentiation.gradient(
    input_data=cell.procedure["Sample"].experiment("Break-in Cycles").discharge(-1),
    x="Capacity [Ah]",
    y="Voltage [V]",
)
But if we provide a NumPy array as input_data, we get one error for each PyProBE object type that input_data could have been:
[9]:
import numpy as np

gradient = differentiation.gradient(
    input_data=np.zeros((10, 2)), x="Capacity [Ah]", y="Voltage [V]"
)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In[9], line 3
1 import numpy as np
----> 3 gradient = differentiation.gradient(
4 input_data=np.zeros((10, 2)), x="Capacity [Ah]", y="Voltage [V]"
5 )
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/_internal/_validate_call.py:38, in update_wrapper_attributes.<locals>.wrapper_function(*args, **kwargs)
36 @functools.wraps(wrapped)
37 def wrapper_function(*args, **kwargs):
---> 38 return wrapper(*args, **kwargs)
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/_internal/_validate_call.py:111, in ValidateCallWrapper.__call__(self, *args, **kwargs)
110 def __call__(self, *args: Any, **kwargs: Any) -> Any:
--> 111 res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
112 if self.__return_pydantic_validator__:
113 return self.__return_pydantic_validator__(res)
ValidationError: 6 validation errors for gradient
input_data.RawData
Input should be a valid dictionary or instance of RawData [type=model_type, input_value=array([[0., 0.],
[..., 0.],
[0., 0.]]), input_type=ndarray]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
input_data.Procedure
Input should be a valid dictionary or instance of Procedure [type=model_type, input_value=array([[0., 0.],
[..., 0.],
[0., 0.]]), input_type=ndarray]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
input_data.Experiment
Input should be a valid dictionary or instance of Experiment [type=model_type, input_value=array([[0., 0.],
[..., 0.],
[0., 0.]]), input_type=ndarray]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
input_data.Cycle
Input should be a valid dictionary or instance of Cycle [type=model_type, input_value=array([[0., 0.],
[..., 0.],
[0., 0.]]), input_type=ndarray]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
input_data.Step
Input should be a valid dictionary or instance of Step [type=model_type, input_value=array([[0., 0.],
[..., 0.],
[0., 0.]]), input_type=ndarray]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
input_data.Result
Input should be a valid dictionary or instance of Result [type=model_type, input_value=array([[0., 0.],
[..., 0.],
[0., 0.]]), input_type=ndarray]
For further information visit https://errors.pydantic.dev/2.10/v/model_type
Second, analysis functions check that the columns required for the computation are present in the PyProBE object provided. As an example, we will call the gradient() function, requesting a column that does not exist in the underlying data:
[10]:
gradient = differentiation.gradient(
    input_data=cell.procedure["Sample"].experiment("Break-in Cycles").discharge(-1),
    x="Temperature [C]",
    y="Voltage [V]",
)
---------------------------------------------------------------------------
ValidationError Traceback (most recent call last)
Cell In[10], line 1
----> 1 gradient = differentiation.gradient(
2 input_data=cell.procedure["Sample"].experiment("Break-in Cycles").discharge(-1),
3 x="Temperature [C]",
4 y="Voltage [V]",
5 )
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/_internal/_validate_call.py:38, in update_wrapper_attributes.<locals>.wrapper_function(*args, **kwargs)
36 @functools.wraps(wrapped)
37 def wrapper_function(*args, **kwargs):
---> 38 return wrapper(*args, **kwargs)
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/_internal/_validate_call.py:111, in ValidateCallWrapper.__call__(self, *args, **kwargs)
110 def __call__(self, *args: Any, **kwargs: Any) -> Any:
--> 111 res = self.__pydantic_validator__.validate_python(pydantic_core.ArgsKwargs(args, kwargs))
112 if self.__return_pydantic_validator__:
113 return self.__return_pydantic_validator__(res)
File ~/work/PyProBE/PyProBE/pyprobe/analysis/differentiation.py:38, in gradient(input_data, x, y)
22 """Differentiate smooth data with a finite difference method.
23
24 A wrapper of the numpy.gradient function. This method calculates the gradient
(...)
35 calculated gradient.
36 """
37 # 2. Validate the inputs to the method
---> 38 validator = AnalysisValidator(
39 input_data=input_data,
40 required_columns=[x, y],
41 # required_type not neccessary here as type specified when declaring
42 # input_data attribute is strict enough
43 )
44 # 3. Retrieve the validated columns as numpy arrays
45 x_data, y_data = validator.variables
File ~/work/PyProBE/PyProBE/.venv/lib/python3.12/site-packages/pydantic/main.py:214, in BaseModel.__init__(self, **data)
212 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
213 __tracebackhide__ = True
--> 214 validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
215 if self is not validated_self:
216 warnings.warn(
217 'A custom validator is returning a value other than `self`.\n'
218 "Returning anything other than `self` from a top level model validator isn't supported when validating via `__init__`.\n"
219 'See the `model_validator` docs (https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more details.',
220 stacklevel=2,
221 )
ValidationError: 1 validation error for AnalysisValidator
Value error, Quantities {'Temperature'} not in data. [type=value_error, input_value={'input_data': Step(base_...re [C]', 'Voltage [V]']}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.10/v/value_error
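Note that the error names the quantity 'Temperature' rather than the full column name 'Temperature [C]'. This suggests that the check compares quantity names with the unit suffix removed. The sketch below shows one way such a comparison could work. It is an assumption made for illustration, not PyProBE's actual code: quantity_of and missing_quantities are hypothetical helpers, and the available column list is copied from the required columns shown earlier.

```python
import re


def quantity_of(column_name):
    """Strip a trailing ' [unit]' suffix, e.g. 'Voltage [V]' -> 'Voltage'."""
    return re.sub(r"\s*\[[^\]]*\]\s*$", "", column_name)


def missing_quantities(requested, available):
    """Return the requested quantities that no available column provides."""
    available_quantities = {quantity_of(name) for name in available}
    requested_quantities = {quantity_of(name) for name in requested}
    return requested_quantities - available_quantities


# Columns of the PyProBE standard format, as listed earlier in this example.
available = ["Time [s]", "Step", "Event", "Current [A]", "Voltage [V]", "Capacity [Ah]"]

result = missing_quantities(["Temperature [C]", "Voltage [V]"], available)
print(result)  # {'Temperature'}
```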