FlickerColumn

_column instance-attribute

_column = column

_df instance-attribute

_df = df

_dtype instance-attribute

_dtype = dtypes[0][1]

dtype property

dtype

__add__

__add__(other)

__and__

__and__(other)

__bool__

__bool__()

__call__

__call__(n=5, use_pandas_dtypes=False)
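
Judging by the signature, calling a FlickerColumn eagerly retrieves the first n values, with use_pandas_dtypes controlling how the values are typed on the Python side. A minimal sketch, assuming (as in flicker's pandas-like API) that indexing a FlickerDataFrame by name returns a FlickerColumn; the column name "age" and the data are illustrative, not from these docs:

```python
from pyspark.sql import SparkSession
from flicker import FlickerDataFrame

spark = SparkSession.builder.getOrCreate()

# Hypothetical data; "age" is an illustrative column name.
df = FlickerDataFrame(spark.createDataFrame([(10,), (25,), (37,)], ['age']))

# Peek at the first n values of the column (n=5 by default per the signature).
print(df['age'](n=2))
```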

__div__

__div__(other)

__eq__

__eq__(other)

__ge__

__ge__(other)

__gt__

__gt__(other)

__invert__

__invert__()

__le__

__le__(other)

__lt__

__lt__(other)

__mod__

__mod__(other)

__mul__

__mul__(other)

__ne__

__ne__(other)

__neg__

__neg__()

__or__

__or__(other)

__pow__

__pow__(other)

__radd__

__radd__(other)

__rand__

__rand__(other)

__rdiv__

__rdiv__(other)

__repr__

__repr__()

__rmod__

__rmod__(other)

__rmul__

__rmul__(other)

__ror__

__ror__(other)

__rpow__

__rpow__(other)

__rsub__

__rsub__(other)

__rtruediv__

__rtruediv__(other)

__str__

__str__()

__sub__

__sub__(other)

__truediv__

__truediv__(other)
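
The operator dunders listed above mean a FlickerColumn can be combined with Python's arithmetic, comparison, and boolean operators, presumably returning new FlickerColumn objects rather than evaluating eagerly. A sketch, reusing the hypothetical df from the __call__ example:

```python
col = df['age']

# Arithmetic via __add__/__radd__, __mul__/__rmul__, __truediv__, __pow__, ...
scaled = (col + 1) * 2 / 4
squared = col ** 2

# Comparisons via __gt__, __ge__, __eq__, __ne__, ... yield boolean columns.
adults = col >= 18

# Boolean algebra via __and__, __or__, __invert__.
in_range = (col > 0) & ~(col > 120)
```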

_ensure_boolean

_ensure_boolean()

_ensure_float

_ensure_float()

_get_non_nan_dataframe

_get_non_nan_dataframe(ignore_nan)

all

all(ignore_null=False)

any

any()
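
all() and any() presumably reduce a boolean column to a single Python bool, with ignore_null controlling whether nulls are skipped in all(). A sketch on a boolean column built from a comparison:

```python
flags = df['age'] > 0  # boolean column, per the operator sketch above

print(flags.any())                  # True if at least one row is True
print(flags.all(ignore_null=True))  # ignore_null per the signature above
```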

apply

apply(udf)
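
The listing above does not pin down what apply(udf) expects. This sketch guesses that udf maps the underlying pyspark Column to a new Column; treat that as an assumption and check the flicker source before relying on it:

```python
from pyspark.sql import functions as F

# Assumption: udf is Column -> Column. If apply() instead expects a
# row-level Python function, it would need wrapping with F.udf first.
shifted = df['age'].apply(lambda c: F.abs(c) + 1)
```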

astype

astype(type_)

Cast the column to a particular dtype.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| type_ | type or str or DataType | The target data type for the column. If type_ is a str, it must be the name of a Spark dtype, such as "int", "bigint", "float", "double", "string", "timestamp", "boolean", or "tinyint" (ByteType), among others. If type_ is a Python type, it must be one of the keys of flicker.PYTHON_TO_SPARK_DTYPES. If type_ is a pyspark.sql.types.DataType, it can be any of the types in pyspark.sql.types.*. | required |

Returns:

| Type | Description |
| --- | --- |
| FlickerColumn | A new FlickerColumn instance with the column cast to the specified data type. |
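
A short example covering the three accepted forms of type_ described in the parameter table above; df is the hypothetical frame from the __call__ sketch:

```python
from pyspark.sql.types import StringType

as_double = df['age'].astype('double')      # Spark dtype name as a str
as_float = df['age'].astype(float)          # key of flicker.PYTHON_TO_SPARK_DTYPES
as_string = df['age'].astype(StringType())  # pyspark.sql.types.DataType instance
```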

describe

describe()

is_nan

is_nan()

is_not_null

is_not_null()

is_null

is_null()

isin

isin(values)
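
is_nan, is_null, is_not_null, and isin presumably mirror pyspark's isnan, isNull, isNotNull, and Column.isin, each returning a boolean column. A sketch with hypothetical data containing one null and one NaN:

```python
# Hypothetical data; "x" is an illustrative column name.
df2 = FlickerDataFrame(
    spark.createDataFrame([(1.0,), (None,), (float('nan'),)], ['x'])
)

nulls = df2['x'].is_null()         # True for the None row
nans = df2['x'].is_nan()           # True for the NaN row
present = df2['x'].is_not_null()   # complement of is_null
small = df2['x'].isin([1.0, 2.0])  # membership test
```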

max

max(ignore_nan=True)

mean

mean(ignore_nan=True)

min

min(ignore_nan=True)

stddev

stddev(ignore_nan=True)
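
min, max, mean, and stddev presumably evaluate eagerly to Python scalars, each with ignore_nan=True by default per the signatures above. A sketch:

```python
print(df['age'].min())
print(df['age'].max())
print(df['age'].mean())

# Keep NaNs in the computation; with NaNs present the result is
# presumably NaN, as in Spark's native aggregations.
print(df['age'].stddev(ignore_nan=False))
```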

take

take(n=5)

value_counts

value_counts(sort=True, ascending=False, drop_null=False, normalize=False, n=None)
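
value_counts looks pandas-inspired: per the signature it can sort by frequency, flip the order, drop nulls, normalize counts to fractions, and cap the output at n rows. A sketch; the return is presumably a FlickerDataFrame of values and their counts:

```python
counts = df['age'].value_counts(sort=True, ascending=False, n=10)
print(counts)

# normalize=True presumably reports fractions instead of raw counts.
fractions = df['age'].value_counts(normalize=True)
```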