data_frame
DataFrame ¶
Bases: ImmutableObject
Two-dimensional PQL DataFrame.
Parameters:
- data (MutableMapping[str, SeriesLike]) – Dictionary with the data for the data frame. Keys are column names; each value can be a Series, PQL query string, PQLColumn, or PQLOperator.
- index (Optional[BaseIndex]) – Index to be used. Default is RangeIndex.
- filters (Optional[FiltersLike]) – Filters to be used. Default is None.
- order_by_columns (Optional[List[OrderByColumn]]) – OrderByColumns used to sort the data frame. Default is None.
- saola_connector (Optional[SaolaConnector]) – Saola connector used to export data.
order_by_columns
property
¶
Returns order by columns of data frame.
query_columns
property
¶
Returns list of PQL columns of data frame.
query_order_by_columns
property
¶
Returns order by columns of data frame.
ncolumns
property
¶
Returns an int representing the number of columns of this data frame.
object_str
staticmethod
¶
Returns string representation of object with given class name and properties.
Parameters:
- class_name (str) – Name of object class.
- properties (OrderedDict[str, Any]) – Properties to include.
Returns:
- str – String representation.
shorten_string
staticmethod
¶
Shortens string to at most max_length characters.
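A minimal sketch of this kind of truncation, in plain Python. The helper name `shorten_sketch`, the default `max_length`, and the trailing "..." marker are all assumptions made for illustration; the source does not say how the library marks the cut.

```python
def shorten_sketch(text: str, max_length: int = 20) -> str:
    """Hypothetical stand-in: limit text to at most max_length characters."""
    if max_length < 3:
        raise ValueError("max_length must be at least 3 to fit the marker")
    if len(text) <= max_length:
        return text
    # Reserve three characters for the "..." marker (an assumption).
    return text[: max_length - 3] + "..."


short = shorten_sketch("CASE WHEN amount > 100 THEN 1 ELSE 0 END", max_length=16)
```

Strings that already fit are returned unchanged, so the function is safe to call on every label.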
from_pql
classmethod
¶
head ¶
Returns the first n rows based on position as pandas DataFrame.
Parameters:
- n (int) – Number of rows to return.
Returns:
- pd.DataFrame – First n rows as pandas DataFrame.
add ¶
Return addition of data frame and other.
Applies ADD operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to be added.
Returns:
- DataFrame – The result of the operation.
sub ¶
Return subtraction of data frame and other.
Applies SUB operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to be subtracted.
Returns:
- DataFrame – The result of the operation.
mul ¶
Return multiplication of data frame and other.
Applies MULT operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to be multiplied.
Returns:
- DataFrame – The result of the operation.
div ¶
Return division of data frame and other.
Applies DIV operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to divide by.
Returns:
- DataFrame – The result of the operation.
floordiv ¶
Return floor division of data frame and other.
Applies FLOOR operator and DIV operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to floor-divide by.
Returns:
- DataFrame – The result of the operation.
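The composition of FLOOR and DIV described for floordiv can be sketched in plain Python on scalars. This is only an illustration of the arithmetic, assuming FLOOR rounds toward negative infinity; `floor_div` is a hypothetical helper, not part of the library.

```python
import math


def floor_div(left: float, right: float) -> int:
    # DIV first, then FLOOR, in the operator order described above.
    return math.floor(left / right)


results = [floor_div(7, 2), floor_div(-7, 2), floor_div(7.5, 2)]
```

Under this assumption negative quotients round down rather than toward zero, which is the same behaviour as Python's own `//` operator.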
mod ¶
Return modulo of data frame and other.
Applies MODULO operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to use as the divisor.
Returns:
- DataFrame – The result of the operation.
pow ¶
Return the data frame raised to the power of other.
Applies POWER operator to column.
Parameters:
- other (Union[DataFrame, Series, NumericValue]) – DataFrame, Series or numeric scalar to use as the exponent.
Returns:
- DataFrame – The result of the operation.
abs ¶
Return the DataFrame with the absolute value of its elements.
Applies ABS operator to column.
round ¶
Rounds the data frame to the given number of decimals.
Applies ROUND operator to column.
lt ¶
Return a DataFrame of booleans indicating whether each element is less than the other.
Applies LOWER_THAN operator to column.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – DataFrame, Series or scalar to be compared.
Returns:
- DataFrame – The result of the operation.
le ¶
Return a DataFrame of booleans indicating whether each element is less than or equal to the other.
Applies LOWER_EQUALS operator to column.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – DataFrame, Series or scalar to be compared.
Returns:
- DataFrame – The result of the operation.
eq ¶
Return a DataFrame of booleans indicating whether each element is equal to the other.
Applies EQUALS operator to column.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – DataFrame, Series or scalar to be compared.
Returns:
- DataFrame – The result of the operation.
ne ¶
Return a DataFrame of booleans indicating whether each element is not equal to the other.
Applies NOT_EQUALS operator to column.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – DataFrame, Series or scalar to be compared.
Returns:
- DataFrame – The result of the operation.
ge ¶
Return a DataFrame of booleans indicating whether each element is greater than or equal to the other.
Applies GREATER_EQUALS operator to column.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – DataFrame, Series or scalar to be compared.
Returns:
- DataFrame – The result of the operation.
gt ¶
Return a DataFrame of booleans indicating whether each element is greater than the other.
Applies GREATER_THAN operator to column.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – DataFrame, Series or scalar to be compared.
Returns:
- DataFrame – The result of the operation.
isnull ¶
Return a boolean same-sized DataFrame indicating if the values are null.
Applies IS NULL operator to column.
Returns:
- DataFrame – A DataFrame of masked bool values for each element that indicates whether an element is a null value.
isin ¶
Returns whether elements of data frame are in values.
Applies IN operator to column.
Parameters:
- values (List[Union[Series, ScalarValue]]) – List of values to test.
Returns:
- DataFrame – The result of the operation.
dropna ¶
Return DataFrame with filter for null values. Rows are removed if any column is null.
Returns:
- DataFrame – A DataFrame with null values filtered out.
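The "any column is null" rule can be illustrated on plain Python data. The sketch below models a table as a dictionary of equally long lists with `None` standing in for null; `drop_null_rows` is a hypothetical helper and says nothing about how the library builds its filter.

```python
from typing import Any, Dict, List


def drop_null_rows(columns: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    names = list(columns)
    # Walk the table row by row.
    rows = zip(*(columns[name] for name in names))
    # Keep a row only if none of its values is null.
    kept = [row for row in rows if all(value is not None for value in row)]
    return {name: [row[i] for row in kept] for i, name in enumerate(names)}


data = {"case": ["A", "B", "C"], "amount": [10.0, None, 30.0]}
result = drop_null_rows(data)
```

Row "B" is dropped even though only one of its two columns is null.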
mean ¶
Return the mean of each column.
Applies AVG operator to column.
Returns:
- pd.Series – Mean of column values.
median ¶
Return the median of each column.
Applies MEDIAN operator to column.
Returns:
- pd.Series – Median of column values.
quantile ¶
Return the quantile of each column.
Applies QUANTILE operator to column.
Parameters:
- q (float) – Quantile to compute. 0 <= q <= 1.
Returns:
- pd.Series – Quantile of column values.
mode ¶
Return the mode of each column.
Applies MODE operator to column.
Returns:
- pd.DataFrame – Mode of column values.
max ¶
Return the max of each column.
Applies MAX operator to column.
Returns:
- pd.Series – Max of column values.
min ¶
Return the min of each column.
Applies MIN operator to column.
Returns:
- pd.Series – Min of column values.
sum ¶
Return the sum of each column.
Applies SUM operator to column.
Returns:
- pd.Series – Sum of column values.
product ¶
Return the product of each column. Null values are skipped.
Applies PRODUCT operator to column. In case of an overflow the result will be null.
Returns:
- pd.Series – Product of column values.
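The two rules stated for product (nulls are skipped, overflow yields null) can be sketched for a single column in plain Python. The overflow limit of the real PRODUCT operator is not given in the source; the sketch assumes double-precision floats, where overflow shows up as infinity, and `product_skip_nulls` is a hypothetical helper.

```python
import math
from typing import Iterable, Optional


def product_skip_nulls(values: Iterable[Optional[float]]) -> Optional[float]:
    # Null values (None here) do not take part in the product.
    result = math.prod(float(v) for v in values if v is not None)
    # Assumption: treat float overflow (infinity) as a null result.
    return None if math.isinf(result) else float(result)


normal = product_skip_nulls([2.0, None, 3.0])
overflow = product_skip_nulls([1e200, 1e200])
```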
count ¶
Return the number of non-null values per column of data frame.
Applies COUNT operator to column.
Returns:
- pd.Series – Number of non-null values per column.
groupby ¶
Returns a GroupByAggregationMethods object that provides the aggregation methods for the given groups.
Parameters:
- by (Union[str, List[str]]) – Column name or list of column names used to determine the groups the aggregation method is applied to.
Returns:
- GroupByAggregationMethods – GroupByAggregationMethods object.
var ¶
Return the variance of each column using the n-1 method. Null values are ignored.
Applies VAR operator to column.
Returns:
- pd.Series – Variance of column values.
std ¶
Return the standard deviation of each column using the n-1 method. Null values are ignored.
Applies STDEV operator to column.
Returns:
- pd.Series – Standard deviation of column values.
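The "n-1 method" used by var and std is the sample variance, which divides the sum of squared deviations by n-1 instead of n. A small sketch for one column using Python's standard `statistics` module, with `None` standing in for null; the helper names are hypothetical.

```python
import statistics
from typing import List, Optional


def sample_var(values: List[Optional[float]]) -> float:
    # Null values are ignored before the variance is computed.
    non_null = [v for v in values if v is not None]
    # statistics.variance uses the n-1 denominator and needs at least two values.
    return statistics.variance(non_null)


def sample_std(values: List[Optional[float]]) -> float:
    non_null = [v for v in values if v is not None]
    return statistics.stdev(non_null)


column = [2.0, None, 4.0, 6.0]
variance = sample_var(column)
std_dev = sample_std(column)
```

For the three non-null values the mean is 4, the squared deviations sum to 8, and dividing by n-1 = 2 gives a variance of 4 and a standard deviation of 2.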
to_float ¶
Converts columns of given data frame to float.
Applies TO_FLOAT operator to column.
to_string ¶
Converts columns of given data frame to string.
Applies TO_STRING operator to column.
Parameters:
- format_ (Optional[str]) – Optional, defines how dates are converted to string.
Returns:
- DataFrame – DataFrame converted to string.
to_date ¶
Converts columns of given data frame to date.
Applies TO_DATE operator to column.
Parameters:
- format_ (str) – Defines how strings are converted to date.
Returns:
- DataFrame – DataFrame converted to date.
astype ¶
Converts columns of given data frame to type.
Parameters:
- type_ (Type[Union[str, int, float]]) – Type to convert to. Supported types are str, int, float.
- **kwargs (Any) – Passed to conversion function.
Returns:
- DataFrame – Converted DataFrame.
nunique ¶
Returns number of unique elements per column of data frame.
Parameters:
- dropna (bool) – Whether to exclude null values from the count.
Returns:
- pd.Series – Number of unique elements per column.
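The effect of the dropna flag can be sketched on a single column in plain Python. This assumes the pandas convention that `dropna=True` leaves nulls out of the count; `nunique_sketch` is a hypothetical helper and its default is an assumption.

```python
from typing import Hashable, Iterable, Optional


def nunique_sketch(values: Iterable[Optional[Hashable]], dropna: bool = True) -> int:
    distinct = set(values)
    if dropna:
        # Leave the null marker out of the distinct values.
        distinct.discard(None)
    return len(distinct)


column = ["a", "b", None, "a"]
without_nulls = nunique_sketch(column)
with_nulls = nunique_sketch(column, dropna=False)
```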
drop ¶
Drops the given columns from the data frame.
Parameters:
- labels (Union[str, List[str]]) – Name or list of names of columns to drop.
Returns:
- DataFrame – DataFrame without given columns.
sort_values ¶
Sorts data frame by given columns.
Parameters:
- by (Union[str, List[str]]) – Name or list of names of columns to sort by.
- ascending (Union[bool, List[bool]]) – Sort ascending or descending. Specify list for multiple sort orders.
Returns:
- DataFrame – DataFrame with OrderByColumns set.
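How a list of column names pairs up with a list of sort directions can be shown on plain Python rows. The sketch below sorts a list of dictionaries and is only an analogy: the library itself records OrderByColumns rather than sorting in memory, and `sort_rows` is a hypothetical helper.

```python
from typing import Any, Dict, List, Union


def sort_rows(
    rows: List[Dict[str, Any]],
    by: Union[str, List[str]],
    ascending: Union[bool, List[bool]] = True,
) -> List[Dict[str, Any]]:
    keys = [by] if isinstance(by, str) else list(by)
    orders = [ascending] * len(keys) if isinstance(ascending, bool) else list(ascending)
    if len(keys) != len(orders):
        raise ValueError("by and ascending must have the same length")
    result = list(rows)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the combined multi-column order.
    for key, asc in reversed(list(zip(keys, orders))):
        result.sort(key=lambda row: row[key], reverse=not asc)
    return result


rows = [
    {"dept": "A", "amount": 1},
    {"dept": "B", "amount": 5},
    {"dept": "A", "amount": 3},
]
ordered = sort_rows(rows, by=["dept", "amount"], ascending=[True, False])
```

Here dept is sorted ascending and, within each dept, amount is sorted descending.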
apply_unary_operator ¶
Applies given unary operator to data frame.
Parameters:
- operator (Type[UnaryPQLOperator]) – Operator to apply.
Returns:
- DataFrame – DataFrame with operator applied.
apply_binary_operator ¶
Applies given binary operator to data frame and exports result.
Parameters:
- other (Union[DataFrame, Series, pd.Series, ScalarValue]) – Other operand to apply binary operator on.
- operator (Type[BinaryPQLOperator]) – Operator to apply.
- reverse (bool) – If true order of operands is reversed.
Returns:
- DataFrame – DataFrame with operator applied.
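The reverse flag matters for operators that are not commutative. A scalar sketch using Python's standard `operator` module; `apply_binary` is a hypothetical helper that only mirrors the operand swap, not the PQL operator classes.

```python
import operator
from typing import Any, Callable


def apply_binary(left: Any, right: Any, op: Callable[[Any, Any], Any], reverse: bool = False) -> Any:
    # With reverse=True the operands swap places, as in Python's
    # reflected methods such as __rsub__.
    return op(right, left) if reverse else op(left, right)


forward = apply_binary(10, 3, operator.sub)
backward = apply_binary(10, 3, operator.sub, reverse=True)
```

This is the distinction between evaluating `frame - 3` and `3 - frame`.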
apply_binary_operator_dunder ¶
Combines data frame with other by applying function for each column for dunder methods.
apply_aggregation_operator ¶
Applies given aggregation operator to data frame and exports result.
Parameters:
- operator (Type[UnaryPQLOperator]) – Operator to apply.
Returns:
- pd.Series – Series with operator applied.
copy ¶
Copies given data frame and overrides properties given as parameters.
verify_columns_contained ¶
Verifies that the data frame contains the given columns.
Parameters:
- columns (List[str]) – List of columns to verify.
Returns:
- typing.Set[str] – Set of verified column names.