functions
importing.functions
Attributes
one_second = np.timedelta64(1, 's')
module-attribute
unix_epoch = np.datetime64(0, 's')
module-attribute
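These two constants suggest a common NumPy idiom: subtracting the epoch from a `datetime64` value and dividing by a one-second `timedelta64` gives the number of seconds since the Unix epoch. The sketch below illustrates that idiom only; the helper name `to_unix_seconds` is hypothetical and is not taken from the module.

```python
import numpy as np

# Same definitions as the module attributes documented above.
one_second = np.timedelta64(1, "s")
unix_epoch = np.datetime64(0, "s")


def to_unix_seconds(timestamps: np.ndarray) -> np.ndarray:
    """Convert datetime64 values to float seconds since the Unix epoch."""
    # datetime64 - datetime64 gives timedelta64; dividing two timedeltas gives floats.
    return (timestamps - unix_epoch) / one_second


stamps = np.array(
    ["1970-01-01T00:01:00", "2000-01-01T00:00:00"], dtype="datetime64[s]"
)
seconds = to_unix_seconds(stamps)
```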
Classes
Classification
Bases: PermissionsBase
Contains instructions on how to classify the data into a specific variable.
In particular, it links a format to a variable and provides the column indices for the value, maximum and minimum columns, as well as for the validator columns. It also records whether the data is accumulated, whether it is incremental, and its resolution. For Thingsboard imports, only the format, variable, accumulate, resolution and incremental fields are applicable.
Attributes:
| Name | Type | Description |
|---|---|---|
| cls_id | AutoField | Primary key. |
| format | ForeignKey | The format of the data file. |
| variable | ForeignKey | The variable to which the data belongs. |
| value | PositiveSmallIntegerField | Index of the value column, starting at 0. |
| maximum | PositiveSmallIntegerField | Index of the maximum value column, starting at 0. |
| minimum | PositiveSmallIntegerField | Index of the minimum value column, starting at 0. |
| value_validator_column | PositiveSmallIntegerField | Index of the value validator column, starting at 0. |
| value_validator_text | CharField | Value validator text. |
| maximum_validator_column | PositiveSmallIntegerField | Index of the maximum value validator column, starting at 0. |
| maximum_validator_text | CharField | Maximum value validator text. |
| minimum_validator_column | PositiveSmallIntegerField | Index of the minimum value validator column, starting at 0. |
| minimum_validator_text | CharField | Minimum value validator text. |
| accumulate | PositiveSmallIntegerField | If set to a number of minutes, the data will be accumulated over that period. |
| resolution | DecimalField | Resolution of the data. Only used if it is to be accumulated. |
| incremental | BooleanField | Whether the data is an incremental counter. If it is, any value below the previous one will be removed. |
| decimal_comma | BooleanField | Whether the data uses a comma as a decimal separator. |
Functions
__str__()
Return the string representation of the object.
Source code in formatting/models.py
clean()
Validate the model instance.
It checks that the column indices are all different, that the accumulation period is greater than zero if it is set, that the resolution is set if the data is accumulated, and that the value column is set if the import is not from Thingsboard.
Source code in formatting/models.py
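The four checks listed for `clean()` can be sketched without Django. This is a minimal illustration under stated assumptions, not the code in formatting/models.py: `ClassificationSketch` and `is_valid` are hypothetical names, `ValueError` stands in for Django's `ValidationError`, and the `thingsboard` flag is placed directly on the sketch although in the real model it belongs to the linked Format.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class ClassificationSketch:
    """Plain-Python stand-in for the Django model (hypothetical)."""

    value: Optional[int] = None
    maximum: Optional[int] = None
    minimum: Optional[int] = None
    value_validator_column: Optional[int] = None
    maximum_validator_column: Optional[int] = None
    minimum_validator_column: Optional[int] = None
    accumulate: Optional[int] = None
    resolution: Optional[Decimal] = None
    thingsboard: bool = False  # assumption: mirrors the flag on the linked Format

    def clean(self) -> None:
        # Collect only the column indices that have been set.
        columns = [
            index
            for index in (
                self.value,
                self.maximum,
                self.minimum,
                self.value_validator_column,
                self.maximum_validator_column,
                self.minimum_validator_column,
            )
            if index is not None
        ]
        if len(columns) != len(set(columns)):
            raise ValueError("Column indices must all be different.")
        if self.accumulate is not None and self.accumulate <= 0:
            raise ValueError("The accumulation period must be greater than zero.")
        if self.accumulate is not None and self.resolution is None:
            raise ValueError("The resolution is required when accumulating.")
        if not self.thingsboard and self.value is None:
            raise ValueError("The value column is required for file imports.")


def is_valid(classification: ClassificationSketch) -> bool:
    """Return True if clean() raises no error."""
    try:
        classification.clean()
    except ValueError:
        return False
    return True
```

For instance, `is_valid(ClassificationSketch(value=2, maximum=2))` is false because two fields point at the same column.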
get_absolute_url()
Get the absolute URL of the object.
Source code in formatting/models.py
DataImport
Bases: PermissionsBase
Model to store the data imports.
This model stores the data imports, which are often files with data uploaded to the system. The data is then processed asynchronously and stored in the database.
Attributes:
| Name | Type | Description |
|---|---|---|
| station | ForeignKey | Station to which the data belongs. |
| format | ForeignKey | Format of the data. |
| rawfile | FileField | File with the data to be imported. |
| date | DateTimeField | Date of submission of the data. |
| start_date | DateTimeField | Start date of the data. |
| end_date | DateTimeField | End date of the data. |
| records | IntegerField | Number of records in the data. |
| observations | TextField | Notes or observations about the data. |
| status | TextField | Status of the import. |
| log | TextField | Log of the data ingestion, indicating any errors. |
Functions
clean()
Validates the information and uploads the measurement data.
Source code in importing/models.py
Format
Bases: PermissionsBase
Details of the data file format, describing how to read the file.
It combines several properties, such as the file extension, the delimiter, the date and time formats, and the column indices for the date and time columns, instructing how to read the data file and parse the dates. It is mostly used to ingest data from text files, like CSV. For Thingsboard imports, only the name, description and thingsboard fields are applicable.
Attributes:
| Name | Type | Description |
|---|---|---|
| format_id | AutoField | Primary key. |
| name | CharField | Short name of the format entry. |
| description | TextField | Description of the format. |
| extension | ForeignKey | The extension of the data file. |
| delimiter | ForeignKey | The delimiter between columns in the data file. Only required for text files. |
| first_row | PositiveSmallIntegerField | Index of the first row with data, starting at 0. |
| footer_rows | PositiveSmallIntegerField | Number of footer rows to be ignored at the end. |
| date | ForeignKey | Format for the date column. Only required for text files. |
| date_column | PositiveSmallIntegerField | Index of the date column, starting at 0. |
| time | ForeignKey | Format for the time column. Only required for text files. |
| time_column | PositiveSmallIntegerField | Index of the time column, starting at 0. |
| thingsboard | BooleanField | Whether the data is being imported from Thingsboard. |
Attributes
datetime_format
property
Obtain the datetime format string.
Functions
__str__()
Return the string representation of the object.
Source code in formatting/models.py
clean()
Validate the model instance.
Checks that the required fields for non-Thingsboard data are provided.
Source code in formatting/models.py
datetime_columns(delimiter)
Column indices that correspond to the date and time columns in the dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| delimiter | str | The delimiter used to split the date and time codes. | required |

Returns:
| Type | Description |
|---|---|
| list[int] | A list of column indices. |
Source code in formatting/models.py
get_absolute_url()
Get the absolute URL of the object.
Source code in formatting/models.py
Measurement
Bases: MeasurementBase
Class to store the measurements and their validation status.
This class holds the value of a given variable and station at a specific time, as well as auxiliary information such as maximum and minimum values, depth and direction (for vector quantities). All of these have a raw version where a backup of the original data is kept, should the value change at any point.
It also includes flags to monitor its validation status, whether the data is active (and can therefore be used for reporting) and whether it has actually been used for that purpose.
Attributes:
| Name | Type | Description |
|---|---|---|
| depth | int | Depth of the measurement. |
| direction | Decimal | Direction of the measurement, useful for vector quantities. |
| raw_value | Decimal | Original value of the measurement. |
| raw_maximum | Decimal | Original maximum value of the measurement. |
| raw_minimum | Decimal | Original minimum value of the measurement. |
| raw_direction | Decimal | Original direction of the measurement. |
| raw_depth | int | Original depth of the measurement. |
| is_validated | bool | Flag to indicate if the measurement has been validated. |
| is_active | bool | Flag to indicate if the measurement is active. An inactive measurement is not used for reporting. |
Attributes
overwritten
property
Indicates if any of the values associated with the entry have been overwritten.
Returns:
| Type | Description |
|---|---|
| bool | True if any raw field is different from the corresponding standard field. |

raws
property
Return the raw fields of the measurement.
Returns:
| Type | Description |
|---|---|
| tuple[str, ...] | Tuple with the names of the raw fields of the measurement. |
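The relationship between `raws` and `overwritten` can be illustrated with a small dataclass. This is a sketch under assumptions, not the model in measurement/models.py: `MeasurementSketch` is a hypothetical name, and the non-raw `value`, `maximum` and `minimum` fields are assumed to exist alongside the documented `raw_*` fields.

```python
from dataclasses import dataclass, fields
from decimal import Decimal
from typing import Optional


@dataclass
class MeasurementSketch:
    """Plain-Python stand-in for the Django model (hypothetical)."""

    value: Optional[Decimal] = None
    maximum: Optional[Decimal] = None
    minimum: Optional[Decimal] = None
    direction: Optional[Decimal] = None
    depth: Optional[int] = None
    raw_value: Optional[Decimal] = None
    raw_maximum: Optional[Decimal] = None
    raw_minimum: Optional[Decimal] = None
    raw_direction: Optional[Decimal] = None
    raw_depth: Optional[int] = None

    @property
    def raws(self) -> tuple:
        """Names of the raw fields, in declaration order."""
        return tuple(f.name for f in fields(self) if f.name.startswith("raw_"))

    @property
    def overwritten(self) -> bool:
        """True if any raw field differs from its standard counterpart."""
        # Strip the "raw_" prefix to find the matching standard field.
        return any(
            getattr(self, raw) != getattr(self, raw[len("raw_"):])
            for raw in self.raws
        )


measurement = MeasurementSketch(value=Decimal("1.5"), raw_value=Decimal("1.5"))
unchanged = measurement.overwritten
measurement.value = Decimal("2.0")  # simulate a manual correction
changed = measurement.overwritten
```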
Functions
clean()
Check the consistency of validation and reporting, and back up the values.
Source code in measurement/models.py
Report
Bases: MeasurementBase
Holds the different reporting data.
It also keeps track of which data has already been used when creating the reports.
Attributes:
| Name | Type | Description |
|---|---|---|
| report_type | str | Type of report. It can be hourly, daily or monthly. |
| completeness | Decimal | Completeness of the report. E.g., a daily report with 24 hourly measurements would have a completeness of 100%. |
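The completeness example above is simple arithmetic: the share of expected measurements that were actually available, as a percentage. A minimal sketch, assuming that interpretation; the function name and signature are hypothetical and do not come from the codebase.

```python
def completeness(observed: int, expected: int) -> float:
    """Percentage of expected measurements that were observed."""
    if expected <= 0:
        raise ValueError("The expected number of measurements must be positive.")
    return 100.0 * observed / expected


full_day = completeness(observed=24, expected=24)     # all 24 hourly values present
partial_day = completeness(observed=18, expected=24)  # six hourly values missing
```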
Functions
clean()
Validate that the report type and the use of the data are consistent.
Source code in measurement/models.py
Functions
construct_matrix(data_import)
Creates dataframes containing the processed data for each variable.
Checks classifications exist for the file format and that there are enough columns in the data file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_import | DataImport | The DataImport object. | required |

Returns:
| Type | Description |
|---|---|
| tuple[Timestamp, Timestamp, list[tuple[int, DataFrame]]] | The start and end dates and a list of tuples containing the variable ID and the associated dataframe containing the variable data. |
Source code in importing/functions.py
get_processed_variable_data(matrix, classification, start_date, end_date, thingsboard=False)
Returns the data table for a given variable, performing necessary validation and data processing steps.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| matrix | DataFrame | The preformatted matrix containing the raw data. | required |
| classification | Classification | A formatting.Classification object. | required |
| start_date | Timestamp | The start date of the data being imported. | required |
| end_date | Timestamp | The end date of the data being imported. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The processed dataframe for the given classification. |
Source code in importing/functions.py
parse_thingsboard_values(data)
Parse the values column for Thingsboard dataframe.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The dataframe containing data to parse. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The dataframe with a numeric values column. |
Source code in importing/functions.py
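One common way to obtain a numeric values column in pandas is `pd.to_numeric` with `errors="coerce"`, which turns unparseable entries into NaN. The sketch below shows that approach only; the column name `value` and the function name `parse_values` are assumptions, and the actual parsing rules used for Thingsboard data may differ.

```python
import pandas as pd


def parse_values(data: pd.DataFrame, column: str = "value") -> pd.DataFrame:
    """Return a copy of the dataframe with the given column made numeric."""
    parsed = data.copy()
    # Entries that cannot be parsed as numbers become NaN instead of raising.
    parsed[column] = pd.to_numeric(parsed[column], errors="coerce")
    return parsed


raw = pd.DataFrame({"value": ["1.5", "2", "n/a"]})
parsed = parse_values(raw)
```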
process_cumulative_data(data, classification, acc, start_date, end_date)
Processes cumulative time series data, aggregating it over the specified time period.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Dataframe containing validated data to be processed. | required |
| classification | Classification | A formatting.Classification object. | required |
| acc | int | The accumulation period in minutes. | required |
| start_date | Timestamp | The start date of the data being imported. | required |
| end_date | Timestamp | The end date of the data being imported. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The processed dataframe with cumulative data. |
Source code in importing/functions.py
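Aggregating over a fixed number of minutes is typically done in pandas with `resample`. The following is a minimal sketch of that technique, not the function's actual body: it assumes columns named `time` and `value`, sums the values in each window, and ignores the `classification`, `start_date` and `end_date` arguments of the real function. The name `accumulate_values` is hypothetical.

```python
import pandas as pd


def accumulate_values(data: pd.DataFrame, acc: int) -> pd.DataFrame:
    """Sum the 'value' column over consecutive windows of `acc` minutes."""
    summed = data.set_index("time")["value"].resample(f"{acc}min").sum()
    return summed.reset_index()


readings = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2023-01-01 00:00",
                "2023-01-01 00:20",
                "2023-01-01 00:40",
                "2023-01-01 01:10",
            ]
        ),
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)
hourly = accumulate_values(readings, acc=60)
```

With these readings, the first three values fall in the 00:00 window and the last one in the 01:00 window.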
process_datetime_columns(data, file_format, timezone)
Process the datetime columns in a DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The DataFrame to process. | required |
| file_format | Format | The file format. | required |
| timezone | str | The timezone to use. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The DataFrame with the datetime columns processed. |
Source code in importing/functions.py
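As an illustration of what processing datetime columns can involve, the sketch below joins a date column and a time column, parses them with a single format string, and localises the result to a timezone. It is an assumption-laden example rather than the function's implementation: the helper name, its explicit arguments (instead of a Format object) and the output column name `date` are all hypothetical.

```python
import pandas as pd


def parse_datetimes(
    data: pd.DataFrame,
    date_column: int,
    time_column: int,
    datetime_format: str,
    timezone: str,
) -> pd.DataFrame:
    """Add a timezone-aware 'date' column built from date and time columns."""
    combined = data[date_column].astype(str) + " " + data[time_column].astype(str)
    parsed = data.copy()
    # Parse as naive datetimes first, then attach the station timezone.
    parsed["date"] = pd.to_datetime(combined, format=datetime_format).dt.tz_localize(
        timezone
    )
    return parsed


raw = pd.DataFrame(
    {0: ["2023-01-31", "2023-02-01"], 1: ["23:55:00", "00:00:00"], 2: [1.2, 1.4]}
)
parsed = parse_datetimes(raw, 0, 1, "%Y-%m-%d %H:%M:%S", "America/Chicago")
```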
process_incremental_data(data)
Processes incremental time series data.
If incremental, it is assumed to only work with 'value' columns; maximum and minimum are excluded.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The dataframe containing validated data to be processed. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The processed dataframe with incremental data. |
Source code in importing/functions.py
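The Classification model describes an incremental counter as one where any value below the previous one is removed. One way to read that rule is sketched below: each value is compared with the row immediately before it in the original data, and rows that decreased are dropped. This reading, the column name `value` and the function name are assumptions; the actual function may perform further steps.

```python
import pandas as pd


def drop_decreasing_values(data: pd.DataFrame) -> pd.DataFrame:
    """Remove rows whose 'value' is lower than that of the preceding row."""
    # diff() is NaN for the first row, so the first row is always kept.
    decreased = data["value"].diff() < 0
    return data.loc[~decreased].reset_index(drop=True)


counter = pd.DataFrame({"value": [10.0, 12.0, 11.0, 15.0]})
filtered = drop_decreasing_values(counter)
```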
read_data_to_import(source_file, file_format, timezone)
Reads the data from file into a pandas DataFrame.
Works out what sort of file is being read and adds standardised columns for datetime.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_file | Any | Stream of data to be parsed. | required |
| file_format | Format | Format of the data to be parsed. | required |
| timezone | str | Timezone name, e.g. 'America/Chicago'. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The DataFrame with raw data read and extra column(s) for datetime correctly parsed. |
Source code in importing/functions.py
read_file_csv(source_file, file_format)
Reads a CSV file into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_file | Any | Stream of data to be parsed. | required |
| file_format | Format | The file format. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | A pandas DataFrame containing the data from the file. |
Source code in importing/functions.py
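The Format fields `delimiter`, `first_row` and `footer_rows` map naturally onto `pandas.read_csv` arguments. The sketch below shows one plausible mapping and is not the function's actual code: the helper takes these values as plain arguments rather than a Format object, and its name is hypothetical.

```python
import io

import pandas as pd


def read_csv_sketch(
    source_file, delimiter: str, first_row: int, footer_rows: int
) -> pd.DataFrame:
    """Read delimited text, skipping header and footer rows."""
    return pd.read_csv(
        source_file,
        sep=delimiter,
        header=None,             # columns are addressed by index, not by name
        skiprows=first_row,      # rows before the first data row
        skipfooter=footer_rows,  # rows to ignore at the end
        engine="python",         # skipfooter is only supported by this engine
    )


text = (
    "Station A\n"
    "date;time;value\n"
    "2023-01-01;00:00;1.5\n"
    "2023-01-01;00:05;1.7\n"
    "End of file"
)
frame = read_csv_sketch(io.StringIO(text), delimiter=";", first_row=2, footer_rows=1)
```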
read_file_excel(file_path, file_format)
Reads an Excel file into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | The path to the file to be read. | required |
| file_format | Format | The file format. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | A pandas DataFrame containing the data from the file. |
Source code in importing/functions.py
read_thingsboard_data_to_import(source_file, timezone)
Reads the data from a Thingsboard json file into a pandas DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_file | | The path to the JSON file. | required |
| timezone | str | The station timezone. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The DataFrame with raw data read and datetime parsed. |
Source code in importing/functions.py
remove_nan_rows(data, classification, columns)
Cleans the dataframe by removing rows composed of only nan values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The dataframe to be cleaned. | required |
| classification | Classification | A formatting.Classification object. | required |
| columns | list[tuple[str, str]] | A mapping for the validated columns. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The cleaned dataframe. |
Source code in importing/functions.py
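Dropping rows that contain only NaN in the measured columns is usually expressed in pandas with `dropna(how="all", subset=...)`. The sketch below shows that call in isolation; it takes a plain list of column names instead of the `classification` and `columns` arguments of the real function, and the names used are assumptions.

```python
import pandas as pd


def drop_empty_rows(data: pd.DataFrame, value_columns: list) -> pd.DataFrame:
    """Drop rows where every one of the given columns is NaN."""
    return data.dropna(how="all", subset=value_columns).reset_index(drop=True)


nan = float("nan")
table = pd.DataFrame(
    {
        "date": ["a", "b", "c"],
        "value": [1.0, nan, nan],
        "maximum": [2.0, nan, 3.0],
    }
)
cleaned = drop_empty_rows(table, ["value", "maximum"])
```

Row `b` is removed because both measured columns are NaN, while row `c` is kept because its maximum is set.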
save_temp_data_to_permanent(data_import)
Function to pass the temporary import to the final table.
This function carries out the following steps:
- Bulk delete of existing data between two times on a given measurement table for the station in question.
- Bulk create to add the new data from the uploaded file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_import | DataImport | The DataImport object. | required |

Returns:
| Type | Description |
|---|---|
| tuple[datetime, datetime, int] | A tuple containing the start date, end date and number of records inserted. |
Source code in importing/functions.py
standardise_floats(data, classification)
Standardises floats and commas.
If a period is used as a decimal separator, commas are removed. If a comma is used, periods are removed and commas replaced with periods. Columns are then converted to numeric type. Note: this assumes that all values are formatted in the same way.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | The dataframe containing data to standardise. | required |
| classification | Classification | A formatting.Classification object. | required |

Returns:
| Type | Description |
|---|---|
| DataFrame | The dataframe with data now standardised. |
Source code in importing/functions.py
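The separator rules described for `standardise_floats` can be sketched on a single pandas Series. This is an illustration of the stated rules, not the function's code: the helper name `standardise` is hypothetical, and it takes the `decimal_comma` flag directly rather than reading it from a Classification object.

```python
import pandas as pd


def standardise(series: pd.Series, decimal_comma: bool) -> pd.Series:
    """Convert text numbers to floats, normalising the decimal separator."""
    text = series.astype(str)
    if decimal_comma:
        # Periods are thousands separators: remove them, then use "." as decimal.
        text = text.str.replace(".", "", regex=False).str.replace(
            ",", ".", regex=False
        )
    else:
        # Commas are thousands separators: remove them.
        text = text.str.replace(",", "", regex=False)
    return pd.to_numeric(text, errors="coerce")


comma_style = standardise(pd.Series(["1.234,5", "12,0"]), decimal_comma=True)
period_style = standardise(pd.Series(["1,234.5", "12.0"]), decimal_comma=False)
```

As the note above says, this only works if every value in a column follows the same convention.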
validate_values(matrix, classification)
Validates the values, maxima and minima according to the classification model, and renames the columns to standard names.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| matrix | DataFrame | The preformatted matrix containing the raw data. | required |
| classification | Classification | A formatting.Classification object. | required |

Returns:
| Type | Description |
|---|---|
| tuple[DataFrame, list[tuple[str, str]]] | A tuple of the validated DataFrame and a list of mappings for the columns that have been validated, to be used in renaming. |
Source code in importing/functions.py