Contributing to the Analysis Module
The classes in pyprobe.analysis perform further analysis of the data.
This document describes the standard format to be used for all PyProBE analysis functions. Constructing your method in this way ensures compatibility with the rest of the PyProBE package, while keeping your code clean and easy to read.
Functions
All calculations should be conducted inside methods. These are called by the user with any additional information required to perform the analysis, and always return Result objects. We will use the gradient() method as an example.
It is recommended to use pydantic’s validate_call function decorator to ensure that objects of the correct type are being passed to your method. This provides the user with an error message if they have not called the method correctly, simplifying debugging.
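As a minimal sketch of what the decorator does, the hypothetical function below (scale_column is not part of PyProBE) is wrapped with validate_call, assuming pydantic v2 is installed. Calling it with a list where a string is expected raises a ValidationError before the function body runs.

```python
from pydantic import ValidationError, validate_call


@validate_call
def scale_column(column: str, factor: float) -> str:
    """Hypothetical helper, used only to illustrate argument validation."""
    return f"{column} * {factor}"


# A correct call passes validation and runs the function body.
label = scale_column("Voltage [V]", 2.0)

# An incorrect call (a list instead of a string) is rejected up front.
try:
    scale_column(["Voltage [V]"], 2.0)
except ValidationError as err:
    error_count = err.error_count()
    print(err)  # names the offending argument and the expected type
```

The printed message points at the argument that failed validation, which is usually enough for the user to correct the call.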
The steps to write a method are as follows:
1. Define the method and its input parameters. One of these is likely to be a PyProBE object; step 2 confirms that it has the columns your method needs.
2. Check that the inputs to the method are valid with the AnalysisValidator class. Pass the class the input data to the method, the columns required for the computation, and the required data type for input_data.
3. If needed, retrieve the columns specified in the required_columns field as numpy arrays by accessing the variables attribute of the AnalysisValidator instance.
4. Perform the required computation. In this example, this is done with np.gradient(), a built-in numpy function. You are encouraged to perform as little of the underlying computation as possible in the analysis class method. Instead, write simple functions in the pyprobe.analysis.base module that process only numpy arrays. This keeps the mathematical underpinnings of PyProBE analysis methods readable, portable and testable.
5. Create a result object to return. This is easily done with the clean_copy() method, which provides a copy of the input data, including the info attribute, but replaces the stored data with a dataframe created from the provided dictionary.
6. Add column definitions to the created result object.
7. Return the result object.
    import numpy as np
    import polars as pl
    from pydantic import validate_call

    # AnalysisValidator, PyProBEDataType and Result are provided by PyProBE.


    @validate_call
    def gradient(  # 1. Define the method
        input_data: PyProBEDataType,
        x: str,
        y: str,
    ) -> Result:
        """Differentiate smooth data with a finite difference method.

        A wrapper of the numpy.gradient function. This method calculates the gradient
        of the data in the y column with respect to the data in the x column.

        Args:
            input_data:
                The input data PyProBE object for the differentiation
            x: The name of the x variable.
            y: The name of the y variable.

        Returns:
            A result object containing the columns, `x`, `y` and the
            calculated gradient.
        """
        # 2. Validate the inputs to the method
        validator = AnalysisValidator(
            input_data=input_data,
            required_columns=[x, y],
            # required_type not necessary here as type specified when declaring
            # input_data attribute is strict enough
        )
        # 3. Retrieve the validated columns as numpy arrays
        x_data, y_data = validator.variables

        # 4. Perform the computation
        gradient_title = f"d({y})/d({x})"
        gradient_data = np.gradient(y_data, x_data)

        # 5. Create a Result object to store the results
        gradient_result = input_data.clean_copy(
            pl.DataFrame({x: x_data, y: y_data, gradient_title: gradient_data})
        )
        # 6. Define the column definitions for the Result object
        gradient_result.column_definitions = input_data.column_definitions
        gradient_result.column_definitions[gradient_title] = "The calculated gradient."
        # 7. Return the Result object
        return gradient_result
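To make the computation step concrete, the sketch below calls np.gradient() directly on two numpy arrays, which is the form in which the validated columns reach it. The sample values are illustrative only and are not taken from PyProBE. The x values are unevenly spaced to show that passing the x array as the second argument accounts for the spacing.

```python
import numpy as np

# Illustrative data: y = x**2 sampled at unevenly spaced x values.
x_data = np.array([0.0, 1.0, 2.0, 4.0])
y_data = x_data**2

# Passing x_data as the second argument gives the coordinates of each
# sample, so the uneven spacing is taken into account.
gradient_data = np.gradient(y_data, x_data)

# Interior points use central differences and recover dy/dx = 2x here;
# the two end points use one-sided differences and are less accurate.
print(gradient_data)  # approximately [1, 2, 4, 6]
```

The output has the same length as the inputs, which is why the gradient can be stored alongside the x and y columns in a single dataframe.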
Base
The pyprobe.analysis.base module exists as a repository for functions used throughout the rest of the analysis module. Often with data analysis code, it is tempting to include data manipulation (forming arrays, dataframes etc. from your standard data format) alongside calculations. By keeping the data manipulation inside the methods of the pyprobe.analysis classes and the calculations in the base submodule, these functions remain more readable, testable and portable.
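The separation can be sketched as follows. Both functions are hypothetical and are not part of PyProBE: calc_power stands in for a base function that sees only numpy arrays, and power stands in for an analysis method that handles the data format. A plain dictionary of lists is used here in place of a PyProBE object so that the snippet runs on its own.

```python
import numpy as np


def calc_power(current: np.ndarray, voltage: np.ndarray) -> np.ndarray:
    """Base-style function: arrays in, array out, no data-format knowledge."""
    return current * voltage


def power(table: dict, current_col: str, voltage_col: str) -> dict:
    """Method-style wrapper: unpacks the data, delegates the calculation."""
    current = np.asarray(table[current_col], dtype=float)
    voltage = np.asarray(table[voltage_col], dtype=float)
    result = dict(table)  # shallow copy so the input is left untouched
    result["Power [W]"] = calc_power(current, voltage).tolist()
    return result


table = {"Current [A]": [1.0, 2.0], "Voltage [V]": [3.5, 4.0]}
result = power(table, "Current [A]", "Voltage [V]")
print(result["Power [W]"])
```

Because calc_power depends only on numpy, it can be tested with hand-written arrays and reused outside the package without any of the surrounding data handling.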
base module functions should be defined as simply as possible, accepting and returning only arrays and floating-point numbers, with clearly defined variables. A good example is the calc_electrode_capacities() function in the degradation_mode_analysis_functions module:
    from typing import Tuple


    def calc_electrode_capacities(
        x_pe_lo: float,
        x_pe_hi: float,
        x_ne_lo: float,
        x_ne_hi: float,
        cell_capacity: float,
    ) -> Tuple[float, float, float]:
        """Calculate the electrode capacities.

        Args:
            x_pe_lo (float): The cathode stoichiometry at lowest cell SOC.
            x_pe_hi (float): The cathode stoichiometry at highest cell SOC.
            x_ne_lo (float): The anode stoichiometry at lowest cell SOC.
            x_ne_hi (float): The anode stoichiometry at highest cell SOC.
            cell_capacity (float): The cell capacity.

        Returns:
            Tuple[float, float, float]:
                - float: The cathode capacity.
                - float: The anode capacity.
                - float: The lithium inventory.
        """
        pe_capacity = cell_capacity / (x_pe_lo - x_pe_hi)
        ne_capacity = cell_capacity / (x_ne_hi - x_ne_lo)
        li_inventory = (pe_capacity * x_pe_lo) + (ne_capacity * x_ne_lo)
        return pe_capacity, ne_capacity, li_inventory
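Because the function takes and returns only floats, it can be checked by hand. The sketch below repeats the function in condensed form so that it runs on its own, then evaluates it for one set of stoichiometry limits. The input values are illustrative assumptions and do not come from PyProBE or from any measured cell.

```python
import math
from typing import Tuple


def calc_electrode_capacities(
    x_pe_lo: float,
    x_pe_hi: float,
    x_ne_lo: float,
    x_ne_hi: float,
    cell_capacity: float,
) -> Tuple[float, float, float]:
    """Condensed copy of the function above, repeated for a runnable example."""
    pe_capacity = cell_capacity / (x_pe_lo - x_pe_hi)
    ne_capacity = cell_capacity / (x_ne_hi - x_ne_lo)
    li_inventory = (pe_capacity * x_pe_lo) + (ne_capacity * x_ne_lo)
    return pe_capacity, ne_capacity, li_inventory


# Illustrative inputs (assumed values, capacity in Ah).
pe, ne, li = calc_electrode_capacities(
    x_pe_lo=0.9, x_pe_hi=0.3, x_ne_lo=0.05, x_ne_hi=0.85, cell_capacity=3.0
)

# Hand calculation: 3.0 / 0.6 = 5.0, 3.0 / 0.8 = 3.75,
# and 5.0 * 0.9 + 3.75 * 0.05 = 4.6875.
assert math.isclose(pe, 5.0)
assert math.isclose(ne, 3.75)
assert math.isclose(li, 4.6875)
```

The comparisons use math.isclose rather than equality because differences such as 0.9 - 0.3 are not exactly representable in floating point.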