Data Handling

Load match data

pyTSPA.data.load_match_data(filepath: str) -> DataFrame

Loads match data from a CSV or Excel file into a DataFrame.

Parameters:

filepath (str) – path to the CSV (.csv) or Excel (.xlsx, .xls) file

Returns:

loaded match data

Return type:

pd.DataFrame

Raises:

ValueError – if the file extension is unsupported or loading fails
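
The behaviour described above (choose a reader by file extension, raise ValueError otherwise) can be illustrated with a minimal sketch. This is not pyTSPA's actual implementation: the name load_match_data_sketch, the column names and the error messages are assumptions made for the example, and only standard pandas readers are used.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_match_data_sketch(filepath: str) -> pd.DataFrame:
    """Hypothetical stand-in for load_match_data; not pyTSPA's own code."""
    suffix = Path(filepath).suffix.lower()
    if suffix == ".csv":
        reader = pd.read_csv
    elif suffix in (".xlsx", ".xls"):
        reader = pd.read_excel
    else:
        # Unsupported extensions are rejected before any file access.
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    try:
        return reader(filepath)
    except Exception as exc:
        # Any loading failure is re-raised as ValueError, as documented.
        raise ValueError(f"Could not load {filepath}: {exc}") from exc


# Write a tiny CSV with made-up columns, then load it back.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "matches.csv"
    csv_path.write_text(
        "date,home_goals,away_goals\n2024-01-05,2,1\n2024-01-12,0,0\n"
    )
    matches = load_match_data_sketch(str(csv_path))

print(matches.shape)  # (2, 3)

# An unsupported extension raises ValueError.
try:
    load_match_data_sketch("matches.json")
except ValueError as err:
    unsupported_error = str(err)
    print(unsupported_error)
```

With pyTSPA installed, the equivalent call would presumably be pyTSPA.data.load_match_data(str(csv_path)).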

Clean match data

pyTSPA.data.clean_data(df: DataFrame, missing_strategy: Literal['fill', 'drop', 'none'] = 'fill') -> DataFrame

Cleans match data: handles missing values and converts date columns.

Parameters:
  • df (pd.DataFrame) – raw data to be cleaned

  • missing_strategy (str) – strategy for handling missing values:

      - “fill”: fill missing numeric values with the column mean (default)
      - “drop”: drop every row that has a missing value in at least one column
      - “none”: leave missing values untouched

Returns:

cleaned DataFrame

Return type:

pd.DataFrame

Raises:

ValueError – if an unknown missing_strategy is given
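
The three strategies can be compared with a small sketch that follows the description above. It is an illustration under stated assumptions, not pyTSPA's code: clean_data_sketch is a hypothetical name, the sample columns are invented, and the rule for detecting date columns (a column literally named "date") is an assumption, since the documentation does not say how date columns are identified.

```python
import numpy as np
import pandas as pd


def clean_data_sketch(df: pd.DataFrame, missing_strategy: str = "fill") -> pd.DataFrame:
    """Hypothetical stand-in for clean_data; not pyTSPA's own code."""
    cleaned = df.copy()
    if missing_strategy == "fill":
        # Only numeric columns are filled, each with its own mean.
        numeric_cols = cleaned.select_dtypes(include="number").columns
        cleaned[numeric_cols] = cleaned[numeric_cols].fillna(
            cleaned[numeric_cols].mean()
        )
    elif missing_strategy == "drop":
        # Drop every row with a missing value in at least one column.
        cleaned = cleaned.dropna()
    elif missing_strategy != "none":
        raise ValueError(f"Unknown missing_strategy: {missing_strategy!r}")
    # Assumption: the date column is the one named "date".
    if "date" in cleaned.columns:
        cleaned["date"] = pd.to_datetime(cleaned["date"])
    return cleaned


# Made-up sample with one missing number and one missing string.
raw = pd.DataFrame(
    {
        "date": ["2024-01-05", "2024-01-12", "2024-01-19"],
        "home_goals": [2.0, np.nan, 4.0],
        "team": ["A", None, "C"],
    }
)

filled = clean_data_sketch(raw, "fill")      # home_goals becomes 2.0, 3.0, 4.0
dropped = clean_data_sketch(raw, "drop")     # the middle row is removed
untouched = clean_data_sketch(raw, "none")   # missing values stay missing

# An unknown strategy raises ValueError.
try:
    clean_data_sketch(raw, "median")
except ValueError as err:
    strategy_error = str(err)
```

Note that “fill” leaves the missing value in the non-numeric team column in place, because only numeric columns have a mean.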

Data profiling

pyTSPA.data.data_profiling(df: DataFrame) -> None

Prints basic information about the DataFrame: column names, types, number of rows and columns, missing values and basic statistics.

Parameters:

df (pd.DataFrame) – the DataFrame to analyze

Returns:

the function only prints information to the console

Return type:

None
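
A rough idea of such a report, built from standard pandas calls, is sketched below. The name data_profiling_sketch, the layout of the printed output and the sample data are assumptions; the real function's output format may differ.

```python
import numpy as np
import pandas as pd


def data_profiling_sketch(df: pd.DataFrame) -> None:
    """Hypothetical stand-in for data_profiling; not pyTSPA's own code."""
    n_rows, n_cols = df.shape
    print(f"Rows: {n_rows}, columns: {n_cols}")
    print("\nColumn types:")
    print(df.dtypes)
    print("\nMissing values per column:")
    print(df.isna().sum())
    print("\nBasic statistics:")
    print(df.describe())


# Made-up sample data with one missing value.
sample = pd.DataFrame(
    {"home_goals": [2.0, np.nan, 4.0], "away_goals": [1, 0, 3]}
)

# The function only prints; its return value is None.
result = data_profiling_sketch(sample)
```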