Module: SparkConnect::Functions

Extended by:
Functions
Included in:
Functions
Defined in:
lib/spark_connect/functions.rb

Overview

The standard Spark SQL function library, mirroring PySpark's pyspark.sql.functions. Every function returns a Column.

Available both as SparkConnect::Functions and the shorthand SparkConnect::F. All methods are module functions.

Following PySpark's convention, a String argument denotes a column name for most functions (e.g. F.sum("salary") aggregates the salary column), while functions whose parameters are genuinely literal (regex patterns, date formats, JSON paths, ...) treat their String arguments as literal values.

Examples:

F = SparkConnect::F
F.col("a") + F.lit(1)
F.when(F.col("x") > 0, "pos").otherwise("non-pos")
F.sum("amount").alias("total")

Constant Summary

Proto =
SparkConnect::Proto
UNIFORM =

The functions named in UNIFORM and NO_ARG are generated programmatically. The @!method directives document them so that they appear in the API reference; each returns a Column.

Generated uniform functions: functions whose arguments are all ColumnOrName (a String denotes a column name). They are defined programmatically to keep the API surface complete and compact.

%w[
  sum avg mean max min first last stddev stddev_samp stddev_pop variance var_samp var_pop
  skewness kurtosis collect_list collect_set first_value last_value max_by min_by corr
  covar_pop covar_samp median mode any_value every some bit_and bit_or bit_xor bool_and bool_or
  product count_if grouping
  abs acos acosh asin asinh atan atanh atan2 bin cbrt ceil ceiling cos cosh cot csc degrees
  exp expm1 factorial floor hypot ln log log2 log10 log1p negative negate positive pow power
  radians rint sec signum sin sinh sqrt tan tanh hex unhex pmod isnan isnull positive
  upper lower ltrim rtrim trim length char_length character_length octet_length bit_length
  reverse ascii base64 unbase64 initcap soundex crc32 md5 sha1 sha ucase lcase
  size cardinality array_distinct array_max array_min array_compact flatten explode explode_outer
  posexplode posexplode_outer inline inline_outer map_keys map_values map_entries map_from_entries
  array_sort shuffle arrays_zip map_concat concat greatest least hash xxhash64
  array_union array_intersect array_except arrays_overlap
  year quarter month dayofmonth day dayofweek dayofyear hour minute second weekofyear last_day
  weekday unix_date unix_micros unix_millis unix_seconds timestamp_seconds timestamp_millis
  timestamp_micros date_from_unix_date
  bitwise_not bit_count typeof
].uniq.freeze
NO_ARG =

No-argument functions.

%w[
  current_date current_timestamp now current_timezone current_user current_catalog
  current_database current_schema monotonically_increasing_id spark_partition_id
  input_file_name input_file_block_start input_file_block_length version uuid
  row_number rank dense_rank percent_rank cume_dist
].freeze
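
The generator itself is not shown on this page, so the following is only a minimal sketch of how a list of names can be turned into module functions. SketchFunctions, UNIFORM_NAMES, NO_ARG_NAMES and the returned hashes are hypothetical stand-ins; the real methods build Column objects.

```ruby
# Hypothetical stand-in module; not the gem's actual implementation.
module SketchFunctions
  extend self # every instance method becomes callable as SketchFunctions.name

  # A duplicated entry ("positive") is dropped by uniq, as in UNIFORM.
  UNIFORM_NAMES = %w[upper lower positive positive].uniq.freeze
  NO_ARG_NAMES = %w[current_date].freeze

  UNIFORM_NAMES.each do |name|
    # Uniform functions accept any number of column-or-name arguments.
    define_method(name) do |*cols|
      { function: name, args: cols }
    end
  end

  NO_ARG_NAMES.each do |name|
    # No-argument functions take nothing and only carry their name.
    define_method(name) do
      { function: name, args: [] }
    end
  end
end

p SketchFunctions.upper("name")
p SketchFunctions.current_date
```

Driving the definitions from a frozen list keeps the documented names and the defined methods in one place.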

Class Attribute Summary

Instance Method Summary

Class Attribute Details

.lambda_counter ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/spark_connect/functions.rb', line 880

def lambda_counter
  @lambda_counter
end

Instance Method Details

#_col(value) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

ColumnOrName coercion: String/Symbol -> column reference, Column -> itself, everything else -> literal.



# File 'lib/spark_connect/functions.rb', line 863

def _col(value)
  case value
  when Column then value
  when String, Symbol then col(value.to_s)
  else lit(value)
  end
end

#_lambda(block) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build a Column wrapping a LambdaFunction from a Ruby block. The block is called with one or more lambda-variable columns and must return a Column.



# File 'lib/spark_connect/functions.rb', line 886

def _lambda(block)
  arity = block.arity.negative? ? 1 : [block.arity, 1].max
  Functions.lambda_counter += 1
  names = (0...arity).map { |i| "x_#{Functions.lambda_counter}_#{i}" }
  vars = names.map do |n|
    Proto::Expression::UnresolvedNamedLambdaVariable.new(name_parts: [n])
  end
  cols = vars.map { |v| Column.new(Proto::Expression.new(unresolved_named_lambda_variable: v)) }
  body = block.call(*cols)
  Column.new(Proto::Expression.new(
               lambda_function: Proto::Expression::LambdaFunction.new(function: body.to_expr, arguments: vars)
             ))
end
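
The arity rule on the first line of _lambda decides how many lambda variables the block receives. The sketch below isolates that expression in plain Ruby; lambda_arity and the counter value are hypothetical names used only for illustration.

```ruby
# Same expression as in _lambda: how many lambda variables a block receives.
def lambda_arity(block)
  block.arity.negative? ? 1 : [block.arity, 1].max
end

one  = lambda_arity(proc { |x| x })        # one declared parameter
two  = lambda_arity(proc { |acc, x| acc }) # two declared parameters
none = lambda_arity(proc { nil })          # arity 0 is raised to 1
rest = lambda_arity(proc { |*xs| xs })     # splat has arity -1, treated as 1

# Variable names follow the "x_<counter>_<index>" pattern used in _lambda.
counter = 3 # hypothetical counter value
names = (0...two).map { |i| "x_#{counter}_#{i}" }

p [one, two, none, rest]
p names
```

Because the counter is part of each name, variables from nested lambdas do not collide.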

#_lit_or_col(value) ⇒ Object

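This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Literal-or-column coercion: a Column is returned unchanged; everything else, including a String, becomes a literal.
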
# File 'lib/spark_connect/functions.rb', line 872

def _lit_or_col(value)
  value.is_a?(Column) ? value : lit(value)
end
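
The two coercion helpers differ only in how they treat a String. The sketch below mirrors their branching with a hypothetical Col struct standing in for the gem's Column; sketch_col and sketch_lit_or_col are illustrative names, not part of the library.

```ruby
# Hypothetical stand-in for Column: records whether a value became a
# column reference or a literal.
Col = Struct.new(:kind, :value)

# Mirrors _col: String/Symbol -> column reference, Col -> itself, else literal.
def sketch_col(value)
  case value
  when Col then value
  when String, Symbol then Col.new(:column, value.to_s)
  else Col.new(:literal, value)
  end
end

# Mirrors _lit_or_col: Col -> itself, everything else (Strings too) -> literal.
def sketch_lit_or_col(value)
  value.is_a?(Col) ? value : Col.new(:literal, value)
end

by_name = sketch_col("salary")        # treated as a column name
literal = sketch_lit_or_col("salary") # treated as a literal string

p by_name.kind
p literal.kind
```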

#abs(*cols) ⇒ Column

The Spark SQL abs function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#acos(*cols) ⇒ Column

The Spark SQL acos function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#acosh(*cols) ⇒ Column

The Spark SQL acosh function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

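#add_months(col, months) ⇒ Column

The Spark SQL add_months function: the date that is months months after the date in col. col is a column or column name; months is passed through lit.

# File 'lib/spark_connect/functions.rb', line 159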

def add_months(col, months) = Column.invoke("add_months", _col(col), lit(months))

#aggregate(col, initial, merge, finish = nil) ⇒ Column

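Aggregates (folds) an array column. merge is a Proc that combines the accumulator with each element; the optional finish Proc post-processes the final accumulator. initial is coerced with _col, so a String names a column and any other non-Column value becomes a literal.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 258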

def aggregate(col, initial, merge, finish = nil)
  args = [_col(col), _col(initial), _lambda(merge)]
  args << _lambda(finish) if finish
  Column.invoke("aggregate", *args)
end
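
As a rough local analogy for the roles of initial, merge and finish, the fold can be written over a plain Ruby array. This is only a sketch of the semantics; local_aggregate is a hypothetical helper, and the real function builds a Spark expression rather than computing anything on the client.

```ruby
# Plain-Ruby analogy: fold the array with merge, then apply finish if given.
def local_aggregate(array, initial, merge, finish = nil)
  acc = array.reduce(initial) { |memo, element| merge.call(memo, element) }
  finish ? finish.call(acc) : acc
end

add = ->(acc, x) { acc + x }

total = local_aggregate([1, 2, 3], 0, add)
mean  = local_aggregate([1, 2, 3], 0, add, ->(acc) { acc / 3.0 })

p total
p mean
```

With the library, merge and finish receive lambda-variable columns instead of Ruby values, so their bodies must be Column expressions.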

#any_value(*cols) ⇒ Column

The Spark SQL any_value function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#approx_count_distinct(col, rsd = nil) ⇒ Column

Returns the approximate distinct count of col, optionally bounded by a maximum relative standard deviation rsd.

Returns:

  • (Column)

    the approximate distinct count (optionally with a maximum relative standard deviation).

# File 'lib/spark_connect/functions.rb', line 70

def approx_count_distinct(col, rsd = nil)
  rsd.nil? ? Column.invoke("approx_count_distinct", _col(col)) : Column.invoke("approx_count_distinct", _col(col), lit(rsd))
end

#array(*cols) ⇒ Column

Returns an array from the given columns.

Returns:

  • (Column)

    an array from the given columns.



# File 'lib/spark_connect/functions.rb', line 96

def array(*cols) = Column.invoke("array", *cols.map { |c| _col(c) })

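#array_append(col, value) ⇒ Column

The Spark SQL array_append function: adds value to the end of the array in col. col is a column or column name; value is passed through lit.

# File 'lib/spark_connect/functions.rb', line 201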

def array_append(col, value) = Column.invoke("array_append", _col(col), lit(value))

#array_compact(*cols) ⇒ Column

The Spark SQL array_compact function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

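#array_contains(col, value) ⇒ Column

The Spark SQL array_contains function: true if the array in col contains value. Like the other array / map functions with value arguments, col is a column or column name and value is passed through lit.

# File 'lib/spark_connect/functions.rb', line 197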

def array_contains(col, value) = Column.invoke("array_contains", _col(col), lit(value))

#array_distinct(*cols) ⇒ Column

The Spark SQL array_distinct function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_except(*cols) ⇒ Column

The Spark SQL array_except function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

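#array_insert(col, pos, value) ⇒ Column

The Spark SQL array_insert function: inserts value into the array in col at the 1-based index pos. col is a column or column name; pos and value are passed through lit.

# File 'lib/spark_connect/functions.rb', line 203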

def array_insert(col, pos, value) = Column.invoke("array_insert", _col(col), lit(pos), lit(value))

#array_intersect(*cols) ⇒ Column

The Spark SQL array_intersect function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

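#array_join(col, delimiter, null_replacement = nil) ⇒ Column

The Spark SQL array_join function: concatenates the elements of the array in col, separated by delimiter. Null elements are replaced by null_replacement when it is given and skipped otherwise. col is a column or column name; delimiter and null_replacement are passed through lit.

# File 'lib/spark_connect/functions.rb', line 205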

def array_join(col, delimiter, null_replacement = nil)
  if null_replacement.nil?
    Column.invoke("array_join", _col(col),
                  lit(delimiter))
  else
    Column.invoke("array_join", _col(col), lit(delimiter), lit(null_replacement))
  end
end
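
The effect of the optional null_replacement can be pictured with a plain Ruby array. The helper below is a hypothetical local analogy of the Spark semantics, with nil standing in for SQL NULL; it is not how the library evaluates the function.

```ruby
# Plain-Ruby analogy: nil elements are skipped unless a replacement is given.
def local_array_join(array, delimiter, null_replacement = nil)
  items =
    if null_replacement.nil?
      array.compact
    else
      array.map { |element| element.nil? ? null_replacement : element }
    end
  items.join(delimiter)
end

skipped = local_array_join(["a", nil, "b"], ",")
filled  = local_array_join(["a", nil, "b"], ",", "?")

p skipped
p filled
```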

#array_max(*cols) ⇒ Column

The Spark SQL array_max function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_min(*cols) ⇒ Column

The Spark SQL array_min function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

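#array_position(col, value) ⇒ Column

The Spark SQL array_position function: the 1-based position of the first occurrence of value in the array, or 0 if it is absent. col is a column or column name; value is passed through lit.

# File 'lib/spark_connect/functions.rb', line 198

def array_position(col, value) = Column.invoke("array_position", _col(col), lit(value))

#array_prepend(col, value) ⇒ Column

The Spark SQL array_prepend function: adds value to the beginning of the array. col is a column or column name; value is passed through lit.

# File 'lib/spark_connect/functions.rb', line 202

def array_prepend(col, value) = Column.invoke("array_prepend", _col(col), lit(value))

#array_remove(col, element) ⇒ Column

The Spark SQL array_remove function: removes every element equal to element from the array. col is a column or column name; element is passed through lit.

# File 'lib/spark_connect/functions.rb', line 199

def array_remove(col, element) = Column.invoke("array_remove", _col(col), lit(element))

#array_repeat(col, count) ⇒ Column

The Spark SQL array_repeat function: an array containing the value of col repeated count times. col is a column or column name; count is passed through lit.

# File 'lib/spark_connect/functions.rb', line 200

def array_repeat(col, count) = Column.invoke("array_repeat", _col(col), lit(count))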
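
For orientation, the Spark semantics of three of these functions can be mimicked on plain Ruby arrays. The local_* helpers are hypothetical analogies written for this sketch; they ignore NULL handling and run on the client, unlike the real functions.

```ruby
# 1-based position of the first match, or 0 when the value is absent.
def local_array_position(array, value)
  index = array.index(value)
  index.nil? ? 0 : index + 1
end

# Drops every element equal to the given one.
def local_array_remove(array, element)
  array.reject { |item| item == element }
end

# Builds an array holding the value count times.
def local_array_repeat(value, count)
  Array.new(count, value)
end

found    = local_array_position([10, 20, 30], 20)
missing  = local_array_position([10, 20, 30], 99)
removed  = local_array_remove([1, 2, 1, 3], 1)
repeated = local_array_repeat(7, 3)

p [found, missing, removed, repeated]
```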

#array_sort(*cols) ⇒ Column

The Spark SQL array_sort function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_union(*cols) ⇒ Column

The Spark SQL array_union function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#arrays_overlap(*cols) ⇒ Column

The Spark SQL arrays_overlap function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#arrays_zip(*cols) ⇒ Column

The Spark SQL arrays_zip function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#asc(col) ⇒ Column

Returns an ascending sort order for the named/given column.

Returns:

  • (Column)

    an ascending sort order for the named/given column.



# File 'lib/spark_connect/functions.rb', line 42

def asc(col) = _col(col).asc

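#asc_nulls_first(col) ⇒ Column

Returns an ascending sort order for the named/given column, with null values placed before non-null values.

# File 'lib/spark_connect/functions.rb', line 44

def asc_nulls_first(col) = _col(col).asc_nulls_first

#asc_nulls_last(col) ⇒ Column

Returns an ascending sort order for the named/given column, with null values placed after non-null values.

# File 'lib/spark_connect/functions.rb', line 45

def asc_nulls_last(col) = _col(col).asc_nulls_last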
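
The difference between the two null placements can be sketched on a plain Ruby array, with nil standing in for SQL NULL. local_sort_asc is a hypothetical helper for illustration only; the real functions return sort-order Columns that Spark evaluates. The default of nulls first matches Spark's default for ascending order.

```ruby
# Ascending sort with an explicit choice of where nil values go.
def local_sort_asc(values, nulls: :first)
  nils, present = values.partition(&:nil?)
  sorted = present.sort
  nulls == :first ? nils + sorted : sorted + nils
end

nulls_first = local_sort_asc([3, nil, 1])
nulls_last  = local_sort_asc([3, nil, 1], nulls: :last)

p nulls_first
p nulls_last
```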

#ascii(*cols) ⇒ Column

The Spark SQL ascii function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#asin(*cols) ⇒ Column

The Spark SQL asin function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#asinh(*cols) ⇒ Column

The Spark SQL asinh function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#atan(*cols) ⇒ Column

The Spark SQL atan function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#atan2(*cols) ⇒ Column

The Spark SQL atan2 function. String arguments are treated as column names.
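
As with PySpark's atan2, the first argument is understood here to be the y coordinate and the second the x coordinate, which is the same order Ruby's Math.atan2 uses. The plain-Ruby sketch below shows why the order matters. The helper name is hypothetical and it operates on numbers, not columns.

```ruby
# Angle of the point (x, y) in degrees, measured from the positive x-axis.
# Note the argument order: y first, then x.
def angle_degrees(y, x)
  Math.atan2(y, x) * 180.0 / Math::PI
end

angle_degrees(1.0, 1.0)  # about 45 degrees
angle_degrees(1.0, -1.0) # about 135 degrees; a single-argument atan(y / x) would give -45
```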

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#atanh(*cols) ⇒ Column

The Spark SQL atanh function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#avg(*cols) ⇒ Column

The Spark SQL avg function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#base64(*cols) ⇒ Column

The Spark SQL base64 function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bin(*cols) ⇒ Column

The Spark SQL bin function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bit_and(*cols) ⇒ Column

The Spark SQL bit_and function. String arguments are treated as column names.
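
bit_and, bit_or, and bit_xor are aggregates: each folds a bitwise operator over every value in a group. The plain-Ruby sketch below reproduces that fold on an in-memory array. The helper names are hypothetical, and skipping nil values is an assumption made here to mirror how SQL aggregates usually treat NULL.

```ruby
# Fold a bitwise operator over the non-nil values of a group.
def bit_and_agg(values) = values.compact.reduce(:&)
def bit_or_agg(values)  = values.compact.reduce(:|)
def bit_xor_agg(values) = values.compact.reduce(:^)

group = [7, 5, 12]  # 0b0111, 0b0101, 0b1100
bit_and_agg(group)  # => 4  (0b0100)
bit_or_agg(group)   # => 15 (0b1111)
bit_xor_agg(group)  # => 14 (0b1110)
```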

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bit_count(*cols) ⇒ Column

The Spark SQL bit_count function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bit_length(*cols) ⇒ Column

The Spark SQL bit_length function. String arguments are treated as column names.
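
bit_length is easy to confuse with char_length and octet_length. The plain-Ruby sketch below computes the three measures for a UTF-8 string, on the assumption that the Spark functions count characters, bytes, and bits respectively. The helper name is hypothetical.

```ruby
# Character, byte, and bit counts of a UTF-8 string.
def string_lengths(str)
  {
    char_length: str.length,       # characters
    octet_length: str.bytesize,    # bytes
    bit_length: str.bytesize * 8   # bits
  }
end

# "\u00e9" (e with acute accent) occupies two bytes in UTF-8.
string_lengths("h\u00e9llo")
# => { char_length: 5, octet_length: 6, bit_length: 48 }
```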

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bit_or(*cols) ⇒ Column

The Spark SQL bit_or function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bit_xor(*cols) ⇒ Column

The Spark SQL bit_xor function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bitwise_not(*cols) ⇒ Column

The Spark SQL bitwise_not function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bool_and(*cols) ⇒ Column

The Spark SQL bool_and function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#bool_or(*cols) ⇒ Column

The Spark SQL bool_or function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#broadcast(df) ⇒ DataFrame

Marks a DataFrame for a broadcast (map-side) join.

Parameters:

  • df (DataFrame)

Returns:

  • (DataFrame)



# File 'lib/spark_connect/functions.rb', line 269

def broadcast(df) = df.hint("broadcast")

#bround(col, scale = 0) ⇒ Column

Returns the column rounded to scale decimal places using HALF_EVEN ("banker's") rounding.

Returns:

  • (Column)

    the column rounded to scale decimal places using HALF_EVEN ("banker's") rounding.



# File 'lib/spark_connect/functions.rb', line 82

def bround(col, scale = 0) = Column.invoke("bround", _col(col), lit(scale))
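
HALF_EVEN rounding sends a value that sits exactly halfway between two candidates to the one whose last digit is even. The plain-Ruby sketch below demonstrates the rule on exact Rational values, so binary floating-point representation does not blur the ties. The helper name is hypothetical and it does not call bround itself.

```ruby
# Round an exact value to `scale` decimal places, breaking ties toward the
# even neighbour (HALF_EVEN, "banker's" rounding).
def half_even(value, scale = 0)
  value.round(scale, half: :even)
end

half_even(Rational("2.5"))      # => 2 (tie goes to the even neighbour)
half_even(Rational("3.5"))      # => 4
half_even(Rational("2.345"), 2) # => (117/50), i.e. 2.34
half_even(Rational("2.355"), 2) # => (59/25), i.e. 2.36

# For comparison, Ruby's default mode rounds ties away from zero.
Rational("2.5").round           # => 3
```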

#cardinality(*cols) ⇒ Column

The Spark SQL cardinality function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#cbrt(*cols) ⇒ Column

The Spark SQL cbrt function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#ceil(*cols) ⇒ Column

The Spark SQL ceil function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#ceiling(*cols) ⇒ Column

The Spark SQL ceiling function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#char_length(*cols) ⇒ Column

The Spark SQL char_length function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#character_length(*cols) ⇒ Column

The Spark SQL character_length function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#coalesce(*cols) ⇒ Column

Returns the first non-null value among the given columns.

Returns:

  • (Column)

    the first non-null value among the given columns.



# File 'lib/spark_connect/functions.rb', line 87

def coalesce(*cols) = Column.invoke("coalesce", *cols.map { |c| _col(c) })
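
To illustrate the row-level semantics only, here is a plain-Ruby sketch. `coalesce_row` is a hypothetical helper, not part of SparkConnect, and a row is modeled as a Hash with nil standing in for SQL NULL.

```ruby
# Hypothetical helper: emulates coalesce for a single row modeled as a Hash.
def coalesce_row(row, *names)
  names.each do |name|
    value = row[name]
    return value unless value.nil? # first non-null candidate wins
  end
  nil # every candidate was null
end

row = { "nickname" => nil, "first_name" => "Ada" }
puts coalesce_row(row, "nickname", "first_name") # prints "Ada"
```

With the library itself, the equivalent expression would be written as F.coalesce("nickname", "first_name"), since String arguments denote column names here.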

#col(name) ⇒ Column Also known as: column

A column reference by name. "*" selects all columns.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 28

def col(name) = Column.from_name(name.to_s)

#collect_list(*cols) ⇒ Column

The Spark SQL collect_list function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#collect_set(*cols) ⇒ Column

The Spark SQL collect_set function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#concat(*cols) ⇒ Column

The Spark SQL concat function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#concat_ws(sep, *cols) ⇒ Column

Returns the concatenation of the given columns, separated by the literal sep.

Returns:

  • (Column)

    the concatenation of the given columns, separated by the literal sep.



# File 'lib/spark_connect/functions.rb', line 107

def concat_ws(sep, *cols) = Column.invoke("concat_ws", lit(sep), *cols.map { |c| _col(c) })
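
Note the asymmetry in the signature: sep is wrapped with lit, so it is a literal string, while the remaining arguments are column names. The plain-Ruby sketch below mimics the result for one row; `concat_ws_row` is a hypothetical helper, rows are modeled as Hashes, and it assumes Spark's documented behavior of skipping null inputs.

```ruby
# Hypothetical helper: emulates concat_ws for a single row modeled as a Hash.
# sep is a literal separator; names are column names looked up in the row.
def concat_ws_row(sep, row, *names)
  names.map { |name| row[name] } # fetch each column's value
       .compact                  # null inputs are skipped
       .join(sep)
end

row = { "city" => "Oslo", "region" => nil, "country" => "NO" }
puts concat_ws_row(", ", row, "city", "region", "country") # prints "Oslo, NO"
```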

#conv(col, from_base, to_base) ⇒ Column

Converts a number string from from_base to to_base.

Returns:

  • (Column)

    the number string converted from from_base to to_base.



# File 'lib/spark_connect/functions.rb', line 145

def conv(col, from_base, to_base) = Column.invoke("conv", _col(col), lit(from_base), lit(to_base))
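
As a rough illustration of what the conversion computes, the sketch below performs the same base change on a single value in plain Ruby. `conv_value` is a hypothetical helper that handles only non-negative inputs and bases 2 through 36. It raises on invalid digits, which may differ from how Spark treats malformed input.

```ruby
# Hypothetical helper: converts a non-negative number string between bases 2..36.
def conv_value(number, from_base, to_base)
  # Parse in the source base, render in the target base (uppercase digits).
  Integer(number.to_s, from_base).to_s(to_base).upcase
end

puts conv_value("100", 2, 10)  # prints "4"
puts conv_value("255", 10, 16) # prints "FF"
```

Both bases are passed through lit in the real method, so they are literal integers rather than column names.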

#corr(*cols) ⇒ Column

The Spark SQL corr function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#cos(*cols) ⇒ Column

The Spark SQL cos function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#cosh(*cols) ⇒ Column

The Spark SQL cosh function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#cot(*cols) ⇒ Column

The Spark SQL cot function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#count(col) ⇒ Column

Returns the number of rows, or the number of non-null values in a column. "*" counts all rows.

Returns:

  • (Column)

    the number of rows, or the number of non-null values in a column. "*" counts all rows.



# File 'lib/spark_connect/functions.rb', line 59

def count(col)
  col.to_s == "*" ? Column.invoke("count", lit(1)) : Column.invoke("count", _col(col))
end
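
The "*" special case above can be pictured with a plain-Ruby sketch over an array of rows. `count_rows` is a hypothetical helper, not part of SparkConnect. Rows are modeled as Hashes, and nil stands in for SQL NULL.

```ruby
# Hypothetical helper: emulates count over rows modeled as Hashes.
def count_rows(rows, col)
  # "*" counts every row, mirroring the count(lit(1)) branch above.
  return rows.size if col.to_s == "*"

  # Otherwise only rows with a non-null value in the column are counted.
  rows.count { |row| !row[col.to_s].nil? }
end

rows = [
  { "id" => 1, "email" => "a@example.com" },
  { "id" => 2, "email" => nil }
]
puts count_rows(rows, "*")     # prints 2
puts count_rows(rows, "email") # prints 1
```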

#count_distinct(*cols) ⇒ Column Also known as: countDistinct

Returns the number of distinct combinations of the given columns.

Returns:

  • (Column)

    the number of distinct combinations of the given columns.



# File 'lib/spark_connect/functions.rb', line 64

def count_distinct(*cols)
  Column.invoke("count", *cols.map { |c| _col(c) }, is_distinct: true)
end
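
A plain-Ruby sketch of the intended result may help when several columns are passed. `count_distinct_rows` is a hypothetical helper, and it assumes Spark's documented COUNT(DISTINCT ...) behavior, in which a combination containing a null is not counted.

```ruby
# Hypothetical helper: emulates count_distinct over rows modeled as Hashes.
def count_distinct_rows(rows, *names)
  rows.map { |row| names.map { |name| row[name] } } # one value tuple per row
      .reject { |combo| combo.any?(&:nil?) }        # tuples with a null are ignored
      .uniq
      .size
end

rows = [
  { "a" => 1,   "b" => "x" },
  { "a" => 1,   "b" => "x" },
  { "a" => 1,   "b" => "y" },
  { "a" => nil, "b" => "y" }
]
puts count_distinct_rows(rows, "a", "b") # prints 2
```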

#count_if(*cols) ⇒ Column

The Spark SQL count_if function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#covar_pop(*cols) ⇒ Column

The Spark SQL covar_pop function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#covar_samp(*cols) ⇒ Column

The Spark SQL covar_samp function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#crc32(*cols) ⇒ Column

The Spark SQL crc32 function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#create_map(*cols) ⇒ Column

Returns a map built from alternating key and value columns.

Returns:

  • (Column)

    a map built from alternating key and value columns.



# File 'lib/spark_connect/functions.rb', line 98

def create_map(*cols) = Column.invoke("map", *cols.map { |c| _col(c) })
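
The alternating key/value convention means arguments are consumed in pairs: first a key column, then its value column. A plain-Ruby sketch of the per-row result follows. `create_map_row` is a hypothetical helper, rows are modeled as Hashes, and the even-count check is an assumption of this sketch rather than documented library behavior.

```ruby
# Hypothetical helper: emulates create_map for a single row modeled as a Hash.
def create_map_row(row, *names)
  raise ArgumentError, "expected an even number of columns" if names.size.odd?

  # Pair up the arguments as (key column, value column) and look both up in the row.
  names.each_slice(2).to_h { |key_col, value_col| [row[key_col], row[value_col]] }
end

row = { "k1" => "a", "v1" => 1, "k2" => "b", "v2" => 2 }
p create_map_row(row, "k1", "v1", "k2", "v2")
```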

#csc(*cols) ⇒ Column

The Spark SQL csc function. String arguments are treated as column names.

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#cume_dist ⇒ Column

The Spark SQL cume_dist function (takes no arguments).

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#current_catalog ⇒ Column

The Spark SQL current_catalog function (takes no arguments).

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#current_database ⇒ Column

The Spark SQL current_database function (takes no arguments).

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#current_date ⇒ Column

The Spark SQL current_date function (takes no arguments).

Returns:

  • (Column)


# File 'lib/spark_connect/functions.rb', line 822

#current_schema ⇒ Column

The Spark SQL current_schema function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_timestamp ⇒ Column

The Spark SQL current_timestamp function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_timezone ⇒ Column

The Spark SQL current_timezone function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_user ⇒ Column

The Spark SQL current_user function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#date_add(col, days) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 156

def date_add(col, days) = Column.invoke("date_add", _col(col), lit(days))

#date_format(col, fmt) ⇒ Object

Date/time function with a literal argument: fmt is wrapped with lit, so a String passed here is a format pattern, not a column name.



# File 'lib/spark_connect/functions.rb', line 153

def date_format(col, fmt) = Column.invoke("date_format", _col(col), lit(fmt))

#date_from_unix_date(*cols) ⇒ Column

The Spark SQL date_from_unix_date function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#date_sub(col, days) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 157

def date_sub(col, days) = Column.invoke("date_sub", _col(col), lit(days))

#date_trunc(fmt, col) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 163

def date_trunc(fmt, col) = Column.invoke("date_trunc", lit(fmt), _col(col))

#datediff(end_col, start_col) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 158

def datediff(end_col, start_col) = Column.invoke("datediff", _col(end_col), _col(start_col))
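
The date helpers differ in which arguments are wrapped as literals: date_add, date_sub, and date_format take the column first and a literal second, date_trunc takes the literal format first, and datediff treats both arguments as columns. A minimal sketch of that argument handling follows. Everything under DateSketch is hypothetical: the Ref, Lit, and Call structs stand in for what _col, lit, and Column.invoke produce in the library, which builds real expression objects instead.

```ruby
# Hypothetical stand-ins; the real library builds Column expressions instead.
module DateSketch
  Ref  = Struct.new(:name)       # column reference (role of _col for a String)
  Lit  = Struct.new(:value)      # literal value (role of lit)
  Call = Struct.new(:fn, :args)  # function invocation (role of Column.invoke)

  module_function

  # A String names a column; any other object passes through unchanged.
  def ref(col)
    col.is_a?(String) ? Ref.new(col) : col
  end

  def lit(value)
    Lit.new(value)
  end

  # Column first, literal day count second.
  def date_add(col, days)
    Call.new("date_add", [ref(col), lit(days)])
  end

  # Literal format first, column second (note the reversed order).
  def date_trunc(fmt, col)
    Call.new("date_trunc", [lit(fmt), ref(col)])
  end

  # Both arguments are columns.
  def datediff(end_col, start_col)
    Call.new("datediff", [ref(end_col), ref(start_col)])
  end
end

p DateSketch.date_trunc("month", "created_at").args
p DateSketch.datediff("shipped_at", "ordered_at").args
```

In this sketch, swapping the arguments of date_trunc would silently turn the column name into a literal and the format into a column reference, which is why the parameter order matters.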

#day(*cols) ⇒ Column

The Spark SQL day function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dayofmonth(*cols) ⇒ Column

The Spark SQL dayofmonth function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dayofweek(*cols) ⇒ Column

The Spark SQL dayofweek function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dayofyear(*cols) ⇒ Column

The Spark SQL dayofyear function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#degrees(*cols) ⇒ Column

The Spark SQL degrees function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dense_rank ⇒ Column

The Spark SQL dense_rank function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#desc(col) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 43

def desc(col) = _col(col).desc

#desc_nulls_first(col) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 46

def desc_nulls_first(col) = _col(col).desc_nulls_first

#desc_nulls_last(col) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 47

def desc_nulls_last(col) = _col(col).desc_nulls_last

#element_at(col, extraction) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 214

def element_at(col, extraction) = Column.invoke("element_at", _col(col), lit(extraction))

#every(*cols) ⇒ Column

The Spark SQL every function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#exists(col, &block) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 246

def exists(col, &block) = Column.invoke("exists", _col(col), _lambda(block))

#exp(*cols) ⇒ Column

The Spark SQL exp function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#explode(*cols) ⇒ Column

The Spark SQL explode function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#explode_outer(*cols) ⇒ Column

The Spark SQL explode_outer function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#expm1(*cols) ⇒ Column

The Spark SQL expm1 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#expr(sql) ⇒ Column

Parse a SQL expression string into a Column.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 37

def expr(sql)
  Column.from_expr(Proto::Expression.new(expression_string: Proto::Expression::ExpressionString.new(expression: sql)))
end
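
As the definition shows, expr only wraps the SQL text in an expression_string message; nothing in this method inspects the string on the client, so parsing presumably happens on the Spark Connect server. A minimal sketch of that nesting, using hypothetical Struct stand-ins (ExprSketch and its two message types are invented here; the real types come from the generated Proto classes):

```ruby
# Stand-in message types for illustration; not the generated protobuf classes.
module ExprSketch
  ExpressionString = Struct.new(:expression, keyword_init: true)
  Expression       = Struct.new(:expression_string, keyword_init: true)

  # Wrap the SQL text without inspecting it, mirroring the nesting in expr.
  def self.expr(sql)
    Expression.new(expression_string: ExpressionString.new(expression: sql))
  end
end

e = ExprSketch.expr("price * quantity AS total")
puts e.expression_string.expression  # the SQL text is carried through unchanged
```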

#factorial(*cols) ⇒ Column

The Spark SQL factorial function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#filter(col, &block) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 248

def filter(col, &block) = Column.invoke("filter", _col(col), _lambda(block))
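
Both exists and filter hand their Ruby block to _lambda, whose implementation is not shown on this page. One plausible approach, sketched here purely as an assumption, is to call the block once with placeholder variables and keep the expression it returns as the lambda body. All names under LambdaSketch (Var, BinOp, Lambda, capture) are hypothetical and exist only for this illustration.

```ruby
# Hypothetical sketch of turning a Ruby block into a lambda expression.
module LambdaSketch
  Var = Struct.new(:name) do
    # Comparison builds an expression node instead of evaluating anything.
    def >(other)
      BinOp.new(">", self, other)
    end
  end
  BinOp  = Struct.new(:op, :left, :right)
  Lambda = Struct.new(:params, :body)

  # Call the block once with one placeholder per block parameter and
  # record whatever expression it returns as the lambda body.
  def self.capture(block)
    params = Array.new(block.arity.abs) { |i| Var.new("x_#{i}") }
    Lambda.new(params, block.call(*params))
  end
end

fn = LambdaSketch.capture(proc { |x| x > 0 })
puts fn.params.map(&:name).inspect  # placeholder names handed to the block
puts fn.body.op                     # operator captured from the block body
```

Under this assumption the block runs once at expression-building time, so it should only combine its arguments into expressions and avoid side effects.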

#first(*cols) ⇒ Column

The Spark SQL first function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#first_value(*cols) ⇒ Column

The Spark SQL first_value function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#flatten(*cols) ⇒ Column

The Spark SQL flatten function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#floor(*cols) ⇒ Column

The Spark SQL floor function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#forall(col, &block) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 247

def forall(col, &block) = Column.invoke("forall", _col(col), _lambda(block))

#format_number(col, d) ⇒ Column

Returns number formatted to d decimal places.

Returns:

  • (Column)

    number formatted to d decimal places.



# File 'lib/spark_connect/functions.rb', line 111

def format_number(col, d) = Column.invoke("format_number", _col(col), lit(d))

#format_string(fmt, *cols) ⇒ Column

Returns printf-style formatting using literal fmt.

Returns:

  • (Column)

    printf-style formatting using literal fmt.



# File 'lib/spark_connect/functions.rb', line 109

def format_string(fmt, *cols) = Column.invoke("format_string", lit(fmt), *cols.map { |c| _col(c) })

#from_json(col, schema, options = {}) ⇒ Object

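Parameters:

  • col (Column, String)

    column, or column name, holding the JSON text.

  • schema (Types::DataType, String)

    target schema; a Types::DataType is sent as its JSON form, any other value via to_s.

  • options (Hash) (defaults to: {})

    parser options; keys and values are converted to Strings.
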
# File 'lib/spark_connect/functions.rb', line 180

def from_json(col, schema, options = {})
  schema_col = schema.is_a?(Types::DataType) ? lit(schema.json) : lit(schema.to_s)
  args = [_col(col), schema_col] + options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] }
  Column.invoke("from_json", *args)
end
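
The options hash in from_json is flattened into alternating key and value arguments, each converted to a String first. A plain-Ruby sketch of just that step follows; the lit wrapping is left out, and flatten_options is a hypothetical helper name that does not exist in the library.

```ruby
# Hypothetical helper mirroring the flat_map step in from_json, minus lit().
def flatten_options(options)
  # Each pair becomes two adjacent String entries: key, then value.
  options.flat_map { |key, value| [key.to_s, value.to_s] }
end

p flatten_options(mode: "PERMISSIVE", allowComments: true)
# prints ["mode", "PERMISSIVE", "allowComments", "true"]
```

Since values go through to_s, a Ruby true arrives as the String "true", so callers can pass Symbols and booleans without converting them first.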

#from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ Object



# File 'lib/spark_connect/functions.rb', line 164

def from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") = Column.invoke("from_unixtime", _col(col), lit(fmt))

#from_utc_timestamp(col, tz) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 170

def from_utc_timestamp(col, tz) = Column.invoke("from_utc_timestamp", _col(col), lit(tz))

#get_json_object(col, path) ⇒ Object

Extracts the value at the literal JSON path from the JSON text in col. First function in the source file's JSON / CSV group.

# File 'lib/spark_connect/functions.rb', line 176

def get_json_object(col, path) = Column.invoke("get_json_object", _col(col), lit(path))

#greatest(*cols) ⇒ Column

The Spark SQL greatest function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#grouping(*cols) ⇒ Column

The Spark SQL grouping function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#hash(*cols) ⇒ Column

The Spark SQL hash function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#hex(*cols) ⇒ Column

The Spark SQL hex function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#hour(*cols) ⇒ Column

The Spark SQL hour function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#hypot(*cols) ⇒ Column

The Spark SQL hypot function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#initcap(*cols) ⇒ Column

The Spark SQL initcap function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#inline(*cols) ⇒ Column

The Spark SQL inline function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#inline_outer(*cols) ⇒ Column

The Spark SQL inline_outer function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#input_file_block_length ⇒ Column

The Spark SQL input_file_block_length function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#input_file_block_start ⇒ Column

The Spark SQL input_file_block_start function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#input_file_name ⇒ Column

The Spark SQL input_file_name function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#instr(col, substr) ⇒ Column

Returns 1-based position of literal substr within col (0 if absent).

Returns:

  • (Column)

    1-based position of literal substr within col (0 if absent).



# File 'lib/spark_connect/functions.rb', line 117

def instr(col, substr) = Column.invoke("instr", _col(col), lit(substr))
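
The return contract of instr (1-based position, 0 if absent) can be illustrated with a plain-Ruby analogue. This is only a sketch of the numbering convention on ordinary Strings: instr_position is a hypothetical helper, not part of the library, and null handling is not modelled.

```ruby
# Plain-Ruby analogue of the instr contract on non-nil Strings.
def instr_position(text, substr)
  index = text.index(substr) # 0-based offset, or nil when substr does not occur
  index.nil? ? 0 : index + 1 # shift to 1-based; 0 signals "absent"
end

p instr_position("spark connect", "connect") # prints 7
p instr_position("spark connect", "flink")   # prints 0
```

Because 0 is reserved for "absent", a result can be tested with a simple greater-than-zero comparison.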

#isnan(*cols) ⇒ Column

The Spark SQL isnan function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#isnull(*cols) ⇒ Column

The Spark SQL isnull function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#json_tuple(col, *fields) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 177

def json_tuple(col, *fields) = Column.invoke("json_tuple", _col(col), *fields.map { |f| lit(f) })

#kurtosis(*cols) ⇒ Column

The Spark SQL kurtosis function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#lag(col, offset = 1, default = nil) ⇒ Object

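Returns the value of col from offset rows before the current row within the window, or default when no such row exists. First function in the source file's Window / analytic functions group.
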
# File 'lib/spark_connect/functions.rb', line 225

def lag(col, offset = 1, default = nil) = Column.invoke("lag", _col(col), lit(offset), lit(default))
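
What lag computes can be sketched in plain Ruby over a single, already ordered partition. This is an illustration of the semantics only, assuming the rows are held in an Array; lag_values is a hypothetical helper and says nothing about how Spark evaluates the window.

```ruby
# Plain-Ruby analogue of lag over one ordered partition.
def lag_values(values, offset = 1, default = nil)
  values.each_index.map do |i|
    source = i - offset # index of the row `offset` positions earlier
    # Fall back to default when that row lies outside the partition.
    source >= 0 && source < values.length ? values[source] : default
  end
end

p lag_values([10, 20, 30])       # prints [nil, 10, 20]
p lag_values([10, 20, 30], 2, 0) # prints [0, 0, 10]
```

The defaults match the signature above: an offset of 1 and a nil default, so the first row of each partition has no predecessor and yields nil.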

#last(*cols) ⇒ Column

The Spark SQL last function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#last_day(*cols) ⇒ Column

The Spark SQL last_day function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#last_value(*cols) ⇒ Column

The Spark SQL last_value function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#lcase(*cols) ⇒ Column

The Spark SQL lcase function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

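#lead(col, offset = 1, default = nil) ⇒ Column

Window / analytic function. Returns the value of col from the row offset rows after the current row in the window, or default when no such row exists. offset and default are passed as literals.
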
# File 'lib/spark_connect/functions.rb', line 226

def lead(col, offset = 1, default = nil) = Column.invoke("lead", _col(col), lit(offset), lit(default))
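
To make the offset and default arguments of lag and lead concrete, here is a small pure-Ruby model over one already-ordered window partition. The helper names are made up for this illustration and are not part of SparkConnect; the real functions build a server-side expression and do not touch Ruby arrays.

```ruby
# Shifts a partition by delta rows, filling positions that fall outside it with default.
def shift_model(rows, delta, default)
  rows.each_index.map do |i|
    j = i + delta
    j.between?(0, rows.length - 1) ? rows[j] : default
  end
end

# lag looks backwards by offset rows, lead looks forwards.
def lag_model(rows, offset = 1, default = nil) = shift_model(rows, -offset, default)
def lead_model(rows, offset = 1, default = nil) = shift_model(rows, offset, default)

rows = [10, 20, 30]
p lag_model(rows)          # [nil, 10, 20]
p lead_model(rows, 1, 0)   # [20, 30, 0]
```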

#least(*cols) ⇒ Column

The Spark SQL least function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#length(*cols) ⇒ Column

The Spark SQL length function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#lit(value) ⇒ Column

A literal value column. See Column.lit for supported Ruby types.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 33

def lit(value) = Column.lit(value)

#ln(*cols) ⇒ Column

The Spark SQL ln function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#locate(substr, col, pos = 1) ⇒ Column

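Returns the 1-based position of the first occurrence of substr in col at or after position pos, or 0 if substr is not found. substr and pos are passed as literals.

Returns:

  • (Column)

    the 1-based position of substr in col at or after pos, or 0 if not found.
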
# File 'lib/spark_connect/functions.rb', line 119

def locate(substr, col, pos = 1) = Column.invoke("locate", lit(substr), _col(col), lit(pos))
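
The 1-based result and the pos argument are easy to get wrong by one. The following pure-Ruby model shows the intended result for a single string value. It is illustrative only: locate_model is a made-up helper, the real evaluation happens in Spark, and only pos >= 1 is modelled.

```ruby
# Models locate for one value: substr comes first, then the string, then the start position.
def locate_model(substr, str, pos = 1)
  idx = str.index(substr, pos - 1)  # Ruby string indexes are 0-based
  idx ? idx + 1 : 0                 # 0 signals "not found"
end

p locate_model("bar", "foobarbar")      # 4
p locate_model("bar", "foobarbar", 5)   # 7
p locate_model("baz", "foobarbar")      # 0
```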

#log(*cols) ⇒ Column

The Spark SQL log function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#log10(*cols) ⇒ Column

The Spark SQL log10 function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#log1p(*cols) ⇒ Column

The Spark SQL log1p function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#log2(*cols) ⇒ Column

The Spark SQL log2 function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#lower(*cols) ⇒ Column

The Spark SQL lower function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#lpad(col, len, pad) ⇒ Column

Left-pads the string column col with pad to a total length of len. len and pad are passed as literals.

Returns:

  • (Column)

    the left-padded string.

# File 'lib/spark_connect/functions.rb', line 121

def lpad(col, len, pad) = Column.invoke("lpad", _col(col), lit(len), lit(pad))
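
As a quick illustration of what the padded result looks like for one value, here is a pure-Ruby model. lpad_model is a made-up helper, not part of SparkConnect, and it assumes len >= 0 and a non-empty pad. Note that Spark's lpad shortens the value when it is already longer than len.

```ruby
# Models lpad for one value: pad on the left up to len, or cut down to len if already longer.
def lpad_model(str, len, pad)
  return str[0, len] if len <= str.length
  str.rjust(len, pad)  # repeats pad as needed and trims it to fit
end

p lpad_model("hi", 5, "??")   # "???hi"
p lpad_model("hi", 1, "??")   # "h"
```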

#ltrim(*cols) ⇒ Column

The Spark SQL ltrim function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

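#make_date(year, month, day) ⇒ Column

Returns a date built from the year, month, and day columns. All three arguments are passed through _col rather than lit.
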
# File 'lib/spark_connect/functions.rb', line 172

def make_date(year, month, day) = Column.invoke("make_date", _col(year), _col(month), _col(day))

#map_concat(*cols) ⇒ Column

The Spark SQL map_concat function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#map_contains_key(col, key) ⇒ Column

Returns true if the map column col contains key. key is passed as a literal.

# File 'lib/spark_connect/functions.rb', line 221

def map_contains_key(col, key) = Column.invoke("map_contains_key", _col(col), lit(key))

#map_entries(*cols) ⇒ Column

The Spark SQL map_entries function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

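#map_filter(col, &block) ⇒ Column

Returns a map holding only the entries of the map column col for which the predicate is true. The block defines the Spark lambda, which Spark applies to each key and value.
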
# File 'lib/spark_connect/functions.rb', line 252

def map_filter(col, &block) = Column.invoke("map_filter", _col(col), _lambda(block))
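
For intuition about the result, the pure-Ruby model below applies a key/value predicate to a plain Hash. map_filter_model is a made-up helper for this illustration. How the library's _lambda helper passes arguments to the block is not shown in this reference, so the block signature here is an assumption based on Spark's two-argument map_filter lambda.

```ruby
# Keeps only the entries for which the predicate returns true.
def map_filter_model(map, &pred)
  map.select { |k, v| pred.call(k, v) }
end

kept = map_filter_model({ "a" => 1, "b" => 5, "c" => 9 }) { |_k, v| v > 2 }
p kept.keys   # ["b", "c"]
```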

#map_from_arrays(keys, values) ⇒ Column

Returns a map from two array columns (keys, values).

Returns:

  • (Column)

    a map from two array columns (keys, values).



# File 'lib/spark_connect/functions.rb', line 100

def map_from_arrays(keys, values) = Column.invoke("map_from_arrays", _col(keys), _col(values))
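
The pairing of the two arrays can be pictured with a short pure-Ruby model. map_from_arrays_model is a made-up helper, not part of SparkConnect. It only covers positional pairing and the requirement that both arrays have the same length, and it ignores Spark's handling of null or duplicate keys.

```ruby
# Pairs keys[i] with values[i]; both arrays must have the same length.
def map_from_arrays_model(keys, values)
  raise ArgumentError, "keys and values differ in length" unless keys.length == values.length
  keys.zip(values).to_h
end

m = map_from_arrays_model(%w[a b], [1, 2])
p m["a"]   # 1
p m["b"]   # 2
```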

#map_from_entries(*cols) ⇒ Column

The Spark SQL map_from_entries function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#map_keys(*cols) ⇒ Column

The Spark SQL map_keys function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#map_values(*cols) ⇒ Column

The Spark SQL map_values function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

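#map_zip_with(c1, c2, &block) ⇒ Column

Merges the map columns c1 and c2 key by key. The block defines the Spark lambda that combines each key with its value from c1 and its value from c2.
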
# File 'lib/spark_connect/functions.rb', line 253

def map_zip_with(c1, c2, &block) = Column.invoke("map_zip_with", _col(c1), _col(c2), _lambda(block))
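
The merge can be pictured with a pure-Ruby model over two plain Hashes. map_zip_with_model is a made-up helper for this illustration. It follows Spark's behaviour of visiting the union of keys and passing a null (here nil) for a key missing from one side; the three-argument block shape is an assumption about how the library forwards the lambda.

```ruby
# Visits every key found in either map and combines the two values with the block.
def map_zip_with_model(m1, m2)
  (m1.keys | m2.keys).to_h { |k| [k, yield(k, m1[k], m2[k])] }
end

merged = map_zip_with_model({ "a" => 1, "b" => 2 }, { "b" => 10, "c" => 20 }) do |_k, v1, v2|
  (v1 || 0) + (v2 || 0)  # treat a missing side as 0
end
p merged["a"]   # 1
p merged["b"]   # 12
p merged["c"]   # 20
```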

#max(*cols) ⇒ Column

The Spark SQL max function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#max_by(*cols) ⇒ Column

The Spark SQL max_by function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822
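
As a plain-Ruby illustration of what Spark's max_by computes (the value of the first column taken from the row where the second column is largest), here is a small model over an in-memory array. The rows, their keys, and the helper name are made up for the example; Spark's null handling and tie-breaking are not modeled.

```ruby
# Hypothetical in-memory rows standing in for a DataFrame.
rows = [
  { name: "ann", salary: 120 },
  { name: "bob", salary: 95 },
  { name: "cy",  salary: 140 }
]

# max_by(value, ordering): the value from the row with the largest ordering.
def max_by_model(rows, value_key, ordering_key)
  rows.max_by { |row| row[ordering_key] }[value_key]
end

top_earner = max_by_model(rows, :name, :salary)
```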


#md5(*cols) ⇒ Column

The Spark SQL md5 function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#mean(*cols) ⇒ Column

The Spark SQL mean function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#median(*cols) ⇒ Column

The Spark SQL median function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#min(*cols) ⇒ Column

The Spark SQL min function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#min_by(*cols) ⇒ Column

The Spark SQL min_by function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#minute(*cols) ⇒ Column

The Spark SQL minute function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#mode(*cols) ⇒ Column

The Spark SQL mode function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#monotonically_increasing_id ⇒ Column

The Spark SQL monotonically_increasing_id function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822
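
Spark documents these IDs as increasing and unique but not consecutive: the current implementation puts the partition ID in the upper 31 bits and the record number within the partition in the lower 33 bits. A small arithmetic model of that layout follows; the helper name is hypothetical, and the layout is an implementation detail that may change.

```ruby
# Model of the documented bit layout: partition ID in the upper bits,
# per-partition record number in the lower 33 bits.
def monotonic_id_model(partition_id, record_number)
  (partition_id << 33) | record_number
end

first_ids = [
  monotonic_id_model(0, 0),
  monotonic_id_model(0, 1),
  monotonic_id_model(1, 0)
]
```

The jump between the last row of one partition and the first row of the next is why the result should be used as an identifier, not as a row counter.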


#month(*cols) ⇒ Column

The Spark SQL month function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


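#months_between(d1, d2, round_off = true) ⇒ Column

Returns the number of months between d1 and d2. The result is rounded to 8 decimal places unless round_off is false.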



# File 'lib/spark_connect/functions.rb', line 160

def months_between(d1, d2, round_off = true) = Column.invoke("months_between", _col(d1), _col(d2), lit(round_off))

#named_struct(*cols) ⇒ Column

Returns a named struct from alternating name/value arguments.

Returns:

  • (Column)

    a named struct from alternating name/value arguments.



# File 'lib/spark_connect/functions.rb', line 102

def named_struct(*cols) = Column.invoke("named_struct", *cols.map { |c| _col(c) })

#nanvl(col1, col2) ⇒ Column

Returns col1 if it is not NaN, otherwise col2.

Returns:

  • (Column)

    col1 if it is not NaN, otherwise col2.



# File 'lib/spark_connect/functions.rb', line 89

def nanvl(col1, col2) = Column.invoke("nanvl", _col(col1), _col(col2))
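
The per-row rule that nanvl applies on the server can be modeled over plain Ruby Floats: keep the first value unless it is NaN, in which case take the second. The helper name and sample data below are illustrative only.

```ruby
# Model of nanvl over plain Floats: keep col1 unless it is NaN.
def nanvl_model(col1, col2)
  col1.nan? ? col2 : col1
end

readings  = [1.5, Float::NAN, 3.0]
fallbacks = [0.0, 0.0, 0.0]
cleaned = readings.zip(fallbacks).map { |a, b| nanvl_model(a, b) }
```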

#negate(*cols) ⇒ Column

The Spark SQL negate function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#negative(*cols) ⇒ Column

The Spark SQL negative function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


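#next_day(col, day_of_week) ⇒ Column

Returns the first date later than col that falls on day_of_week. The day_of_week argument is a literal day name such as "Mon", not a column name.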



# File 'lib/spark_connect/functions.rb', line 161

def next_day(col, day_of_week) = Column.invoke("next_day", _col(col), lit(day_of_week))
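
Because day_of_week is wrapped in lit, it is a literal day name, and Spark returns the first date strictly after the input that falls on that weekday. A plain-Ruby model of that rule using the standard Date class follows; the helper name and the two-letter prefix matching are assumptions made for the sketch, not a statement of which spellings Spark accepts.

```ruby
require "date"

# Day-name prefixes mapped to Ruby's Date#wday numbering (Sunday = 0).
WDAY_BY_PREFIX = {
  "SU" => 0, "MO" => 1, "TU" => 2, "WE" => 3, "TH" => 4, "FR" => 5, "SA" => 6
}.freeze

# Model of next_day: the first date strictly after `date` on the given weekday.
def next_day_model(date, day_of_week)
  target = WDAY_BY_PREFIX.fetch(day_of_week[0, 2].upcase)
  delta = (target - date.wday) % 7
  delta = 7 if delta.zero? # same weekday: move a full week ahead
  date + delta
end

following_tuesday = next_day_model(Date.new(2015, 1, 14), "TU")
```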

#now ⇒ Column

The Spark SQL now function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


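#nth_value(col, offset, ignore_nulls = false) ⇒ Column

Returns the value of col at the offset-th row (counting from 1) of the window frame, or null if the frame has fewer rows. When ignore_nulls is true, null values are skipped while counting.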



# File 'lib/spark_connect/functions.rb', line 228

def nth_value(col, offset, ignore_nulls = false) = Column.invoke("nth_value", _col(col), lit(offset), lit(ignore_nulls))
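
The effect of offset and ignore_nulls is easiest to see on a single window frame. The sketch below models the frame as a Ruby Array with nil standing for SQL NULL; the helper name is hypothetical, and frame boundaries and ordering are assumed to be decided elsewhere.

```ruby
# Model of nth_value over one window frame given as an Array (nil = SQL NULL).
# offset counts from 1; the result is nil when the frame has too few rows.
def nth_value_model(frame, offset, ignore_nulls = false)
  raise ArgumentError, "offset must be at least 1" unless offset >= 1

  candidates = ignore_nulls ? frame.compact : frame
  candidates[offset - 1]
end

frame = [nil, 10, nil, 20]
second = nth_value_model(frame, 2)
second_non_null = nth_value_model(frame, 2, true)
```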

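#ntile(n) ⇒ Column

Returns the bucket number (from 1 to n) of the current row within an ordered window partition that is split into n buckets.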



# File 'lib/spark_connect/functions.rb', line 227

def ntile(n) = Column.invoke("ntile", lit(n))
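
ntile splits an ordered partition into n buckets whose sizes differ by at most one, with the larger buckets coming first. A small model of the resulting bucket numbers, assuming the rows are already ordered (the helper name is illustrative):

```ruby
# Model of ntile(n) over an already-ordered partition of row_count rows.
# Returns the bucket number assigned to each row, in row order.
def ntile_model(row_count, n)
  base, extra = row_count.divmod(n)
  (1..n).flat_map do |bucket|
    size = base + (bucket <= extra ? 1 : 0) # the first `extra` buckets get one more row
    [bucket] * size
  end
end

buckets = ntile_model(10, 3)
```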

#octet_length(*cols) ⇒ Column

The Spark SQL octet_length function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


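#overlay(col, replace, pos, len = -1) ⇒ Column

Returns col with replace overlaid starting at the 1-based position pos, replacing len characters. The default len of -1 replaces as many characters as replace is long.

Returns:

  • (Column)

    col with replace overlaid starting at position pos, replacing len characters.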



# File 'lib/spark_connect/functions.rb', line 141

def overlay(col, replace, pos, len = -1) = Column.invoke("overlay", _col(col), _col(replace), lit(pos), lit(len))
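
The interaction between pos and len is easier to follow on plain strings. The model below mirrors the examples in Spark's SQL function reference; the helper name is hypothetical, and the String arguments here are literal values for the sketch, whereas the real function takes columns for col and replace.

```ruby
# Model of overlay on plain Strings. pos counts from 1; a negative len
# (the default) replaces as many characters as `replace` is long.
def overlay_model(input, replace, pos, len = -1)
  len = replace.length if len.negative?
  head = input[0, pos - 1].to_s
  tail = input[(pos - 1 + len)..].to_s # nil when the cut runs past the end
  head + replace + tail
end

patched = overlay_model("Spark SQL", "_", 6)
inserted = overlay_model("Spark SQL", "ANSI ", 7, 0)
```

A len of 0 turns the overlay into a pure insertion, while a len longer than replace removes extra characters from the original.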

#percent_rank ⇒ Column

The Spark SQL percent_rank function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822
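
percent_rank is a window function: for each row it evaluates (rank - 1) / (rows in partition - 1), where tied values share the rank of the first of them. A model over one ordered partition, with a hypothetical helper name and sample values:

```ruby
# Model of percent_rank over one partition whose values are already sorted.
def percent_rank_model(sorted_values)
  n = sorted_values.length
  sorted_values.map do |value|
    rank = sorted_values.index(value) + 1 # ties take the position of their first occurrence
    n > 1 ? (rank - 1).fdiv(n - 1) : 0.0
  end
end

ranks = percent_rank_model([10, 20, 20, 30])
```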


#pmod(*cols) ⇒ Column

The Spark SQL pmod function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822
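
pmod is the positive modulus: unlike a truncated remainder, whose sign follows the dividend, its result for a positive divisor is never negative. A sketch for Integers follows; the helper name is illustrative, and only positive divisors are modeled.

```ruby
# Model of pmod for Integers, assuming a positive divisor.
def pmod_model(dividend, divisor)
  raise ArgumentError, "this sketch only covers positive divisors" unless divisor.positive?

  r = dividend.remainder(divisor) # truncated remainder; sign follows the dividend
  r.negative? ? r + divisor : r
end

truncated = -10.remainder(3)
positive_mod = pmod_model(-10, 3)
```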


#posexplode(*cols) ⇒ Column

The Spark SQL posexplode function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#posexplode_outer(*cols) ⇒ Column

The Spark SQL posexplode_outer function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822
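
posexplode and posexplode_outer differ only in how they treat a null or empty array. The model below shows the rows each would produce for a single input array, with nil standing for SQL NULL; both helper names are hypothetical.

```ruby
# Model of posexplode on one array value: one output row per element,
# pairing the element's zero-based position with its value.
def posexplode_model(array)
  (array || []).each_with_index.map { |value, pos| [pos, value] }
end

# Model of posexplode_outer: a null or empty array still yields one (nil, nil) row.
def posexplode_outer_model(array)
  rows = posexplode_model(array)
  rows.empty? ? [[nil, nil]] : rows
end

exploded = posexplode_model(%w[a b c])
kept_row = posexplode_outer_model([])
```

With the plain variant, a row whose array is null or empty vanishes from the output; the outer variant keeps it.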


#positive(*cols) ⇒ Column

The Spark SQL positive function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822


#pow(*cols) ⇒ Column

The Spark SQL pow function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#power(*cols) ⇒ Column

The Spark SQL power function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#product(*cols) ⇒ Column

The Spark SQL product function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#quarter(*cols) ⇒ Column

The Spark SQL quarter function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#radians(*cols) ⇒ Column

The Spark SQL radians function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#rand(seed = nil) ⇒ Column

Returns random values drawn uniformly from [0.0, 1.0). A non-nil seed is passed to Spark as a literal.

Returns:

  • (Column)

    random values drawn uniformly from [0.0, 1.0).

# File 'lib/spark_connect/functions.rb', line 236

def rand(seed = nil) = seed.nil? ? Column.invoke("rand") : Column.invoke("rand", lit(seed))

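#randn(seed = nil) ⇒ Column

Returns random values drawn from the standard normal distribution. A non-nil seed is passed to Spark as a literal.

Returns:

  • (Column)

    random values drawn from the standard normal distribution.

# File 'lib/spark_connect/functions.rb', line 237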

def randn(seed = nil) = seed.nil? ? Column.invoke("randn") : Column.invoke("randn", lit(seed))

#rank ⇒ Column

The Spark SQL rank function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

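#regexp_count(col, pattern) ⇒ Column

Returns the number of times pattern matches in col.

Returns:

  • (Column)

    the number of times pattern matches in col.

# File 'lib/spark_connect/functions.rb', line 138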

def regexp_count(col, pattern) = Column.invoke("regexp_count", _col(col), lit(pattern))

#regexp_extract(col, pattern, idx = 0) ⇒ Column

Returns the idx-th group of pattern matched in col.

Returns:

  • (Column)

    the idx-th group of pattern matched in col.

# File 'lib/spark_connect/functions.rb', line 131

def regexp_extract(col, pattern, idx = 0) = Column.invoke("regexp_extract", _col(col), lit(pattern), lit(idx))
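The group index can be illustrated locally with a plain-Ruby analogue. This is a sketch, not a Spark call: extract_group is a hypothetical helper, and Spark evaluates the pattern with Java regex rules, which can differ from Ruby's in edge cases.

```ruby
# Plain-Ruby analogue of the idx semantics: group 0 is the whole match,
# group 1 the first capture group, and so on.
def extract_group(text, pattern, idx = 0)
  match = Regexp.new(pattern).match(text)
  # Spark's regexp_extract yields an empty string when nothing matches.
  match ? match[idx].to_s : ""
end

whole = extract_group("100-200", "(\\d+)-(\\d+)")     # whole match
first = extract_group("100-200", "(\\d+)-(\\d+)", 1)  # first capture group
```

Because the pattern parameter is a literal, passing a String there never refers to a column, unlike the col parameter.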

#regexp_extract_all(col, pattern, idx = 1) ⇒ Column

Returns all matches of group idx of pattern.

Returns:

  • (Column)

    all matches of group idx of pattern.

# File 'lib/spark_connect/functions.rb', line 133

def regexp_extract_all(col, pattern, idx = 1) = Column.invoke("regexp_extract_all", _col(col), lit(pattern), lit(idx))

#regexp_like(col, pattern) ⇒ Column

Returns whether col matches pattern.

Returns:

  • (Column)

    whether col matches pattern.

# File 'lib/spark_connect/functions.rb', line 137

def regexp_like(col, pattern) = Column.invoke("regexp_like", _col(col), lit(pattern))

#regexp_replace(col, pattern, replacement) ⇒ Column

Returns col with pattern replaced by replacement.

Returns:

  • (Column)

    col with pattern replaced by replacement.

# File 'lib/spark_connect/functions.rb', line 135

def regexp_replace(col, pattern, replacement) = Column.invoke("regexp_replace", _col(col), lit(pattern), lit(replacement))

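#regexp_substr(col, pattern) ⇒ Column

Returns the first substring of col that matches pattern.

Returns:

  • (Column)

    the first substring of col that matches pattern.

# File 'lib/spark_connect/functions.rb', line 139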

def regexp_substr(col, pattern) = Column.invoke("regexp_substr", _col(col), lit(pattern))

#repeat(col, n) ⇒ Column

Returns the string repeated n times.

Returns:

  • (Column)

    the string repeated n times.

# File 'lib/spark_connect/functions.rb', line 125

def repeat(col, n) = Column.invoke("repeat", _col(col), lit(n))

#reverse(*cols) ⇒ Column

The Spark SQL reverse function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#rint(*cols) ⇒ Column

The Spark SQL rint function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#round(col, scale = 0) ⇒ Column

Returns col rounded to scale decimal places using HALF_UP rounding.

Returns:

  • (Column)

    col rounded to scale decimal places using HALF_UP rounding.

# File 'lib/spark_connect/functions.rb', line 80

def round(col, scale = 0) = Column.invoke("round", _col(col), lit(scale))
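HALF_UP differs from HALF_EVEN ("banker's") rounding only on exact ties. The two tie-breaking rules can be compared locally with Ruby's own Float#round, which accepts a half: mode. This is a plain-Ruby sketch with hypothetical helper names; it does not call Spark.

```ruby
# HALF_UP: ties move away from zero (2.5 -> 3, -2.5 -> -3).
def round_half_up(value, scale = 0)
  value.round(scale, half: :up)
end

# HALF_EVEN: ties move to the nearest even digit (2.5 -> 2, 3.5 -> 4).
def round_half_even(value, scale = 0)
  value.round(scale, half: :even)
end

ties = [0.5, 1.5, 2.5, 3.5]
half_up = ties.map { |v| round_half_up(v) }
half_even = ties.map { |v| round_half_even(v) }
```

Values that are not exact ties round identically under both modes.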

#row_number ⇒ Column

The Spark SQL row_number function (takes no arguments).

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

#rpad(col, len, pad) ⇒ Column

Returns col right-padded with pad to length len.

Returns:

  • (Column)

    col right-padded with pad to length len.

# File 'lib/spark_connect/functions.rb', line 123

def rpad(col, len, pad) = Column.invoke("rpad", _col(col), lit(len), lit(pad))
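A plain-Ruby analogue of the padding behaviour is sketched below, assuming Spark's documented rule that a value longer than len is truncated to len. rpad_sketch is a hypothetical helper and does not call Spark; an empty pad string is not handled.

```ruby
# Pad on the right by cycling through pad, then cut to exactly len characters.
def rpad_sketch(text, len, pad)
  text.ljust(len, pad)[0, len]
end

padded = rpad_sketch("hi", 5, "??")     # padding case
truncated = rpad_sketch("hi", 1, "??")  # truncation case
```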

#rtrim(*cols) ⇒ Column

The Spark SQL rtrim function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

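#schema_of_json(json, options = {}) ⇒ Column

Returns the schema inferred from a JSON string, in DDL format. Each options key and value is converted to a String and passed as a literal.

Returns:

  • (Column)

    the schema inferred from json, in DDL format.

# File 'lib/spark_connect/functions.rb', line 191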

def schema_of_json(json, options = {})
  Column.invoke("schema_of_json", _lit_or_col(json), *options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] })
end
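The options handling above flattens the Hash into alternating key and value arguments. The sketch below isolates that step with plain Strings in place of lit(...); flatten_options is a hypothetical helper, and the option names are only examples.

```ruby
# Turn { key => value, ... } into [key, value, key, value, ...], all Strings.
def flatten_options(options)
  options.flat_map { |key, value| [key.to_s, value.to_s] }
end

args = flatten_options({ allowComments: true, "mode" => "PERMISSIVE" })
```

Symbol keys and non-String values are therefore acceptable on the Ruby side, since everything is stringified before it is sent.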

#sec(*cols) ⇒ Column

The Spark SQL sec function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#second(*cols) ⇒ Column

The Spark SQL second function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

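#sequence(start, stop, step = nil) ⇒ Column

Returns an array of values from start to stop (inclusive), advancing by step when it is given.

Returns:

  • (Column)

    an array of values from start to stop (inclusive).

# File 'lib/spark_connect/functions.rb', line 217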

def sequence(start, stop, step = nil)
  step.nil? ? Column.invoke("sequence", _col(start), _col(stop)) : Column.invoke("sequence", _col(start), _col(stop), _col(step))
end
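For integer inputs, the values Spark's sequence produces can be reproduced locally. The sketch assumes Spark's documented default step of 1 when start <= stop and -1 otherwise; sequence_sketch is a hypothetical helper that returns a Ruby Array rather than a Column.

```ruby
# Inclusive integer sequence with an optional step.
def sequence_sketch(start, stop, step = nil)
  step ||= start <= stop ? 1 : -1
  start.step(stop, step).to_a
end

ascending = sequence_sketch(1, 5)
descending = sequence_sketch(5, 1)
strided = sequence_sketch(1, 10, 3)
```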

#sha(*cols) ⇒ Column

The Spark SQL sha function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#sha1(*cols) ⇒ Column

The Spark SQL sha1 function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#sha2(col, num_bits) ⇒ Column

Returns the SHA-2 hash of col for the given bit length (224/256/384/512).

Returns:

  • (Column)

    the SHA-2 hash of col for the given bit length (224/256/384/512).

# File 'lib/spark_connect/functions.rb', line 143

def sha2(col, num_bits) = Column.invoke("sha2", _col(col), lit(num_bits))
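To check expected digests outside Spark, Ruby's standard digest library computes the same SHA-2 family. The sketch below is an assumption-labelled local helper (sha2_hex is hypothetical); it covers 256, 384 and 512 bits only, because Digest::SHA2 does not offer a 224-bit variant.

```ruby
require "digest/sha2"

# Hex-encoded SHA-2 digest of a String for a supported bit length.
def sha2_hex(data, num_bits)
  unless [256, 384, 512].include?(num_bits)
    raise ArgumentError, "unsupported bit length: #{num_bits}"
  end
  Digest::SHA2.new(num_bits).hexdigest(data)
end

digest = sha2_hex("abc", 256)
```

Each hex digest is num_bits / 4 characters long, which is a quick sanity check on values coming back from a query.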

#shiftleft(col, num_bits) ⇒ Column

Returns col shifted left by num_bits, a literal bit count.

Returns:

  • (Column)

    col shifted left by num_bits, a literal bit count.

# File 'lib/spark_connect/functions.rb', line 147

def shiftleft(col, num_bits) = Column.invoke("shiftleft", _col(col), lit(num_bits))

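#shiftright(col, num_bits) ⇒ Column

Returns col shifted right (signed) by num_bits, a literal bit count.

Returns:

  • (Column)

    col shifted right (signed) by num_bits, a literal bit count.

# File 'lib/spark_connect/functions.rb', line 148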

def shiftright(col, num_bits) = Column.invoke("shiftright", _col(col), lit(num_bits))

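#shiftrightunsigned(col, num_bits) ⇒ Column

Returns col shifted right (unsigned) by num_bits, a literal bit count.

Returns:

  • (Column)

    col shifted right (unsigned) by num_bits, a literal bit count.

# File 'lib/spark_connect/functions.rb', line 149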

def shiftrightunsigned(col, num_bits) = Column.invoke("shiftrightunsigned", _col(col), lit(num_bits))
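The three shift functions differ mainly in how negative values are treated. The plain-Ruby sketch below emulates them for 32-bit integers, assuming an integer column and 0 <= n < 32; the helper names are hypothetical and nothing here calls Spark. Ruby integers are arbitrary precision, so the 32-bit wrap-around has to be applied by hand.

```ruby
INT32_MASK = 0xFFFF_FFFF

# Reinterpret the low 32 bits of an integer as a signed 32-bit value.
def to_int32(bits)
  bits &= INT32_MASK
  bits >= 0x8000_0000 ? bits - 0x1_0000_0000 : bits
end

# Left shift, wrapping into the signed 32-bit range.
def shiftleft32(value, n)
  to_int32(value << n)
end

# Signed right shift: the sign bit is copied into the vacated positions.
def shiftright32(value, n)
  value >> n
end

# Unsigned right shift: zeros are shifted in, so negatives become positive.
def shiftrightunsigned32(value, n)
  to_int32((value & INT32_MASK) >> n)
end

signed = shiftright32(-8, 1)
unsigned = shiftrightunsigned32(-8, 1)
```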

#shuffle(*cols) ⇒ Column

The Spark SQL shuffle function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#signum(*cols) ⇒ Column

The Spark SQL signum function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#sin(*cols) ⇒ Column

The Spark SQL sin function. String arguments are treated as column names.

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 822

Generated from the UNIFORM list (see Constant Summary).

#sinh(*cols) ⇒ Column

The Spark SQL sinh function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#size(*cols) ⇒ Column

The Spark SQL size function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#skewness(*cols) ⇒ Column

The Spark SQL skewness function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

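#slice(col, start, length) ⇒ Column

Returns length elements of the array col, starting at 1-based index start (a negative start counts from the end).
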
# File 'lib/spark_connect/functions.rb', line 215

def slice(col, start, length) = Column.invoke("slice", _col(col), _lit_or_col(start), _lit_or_col(length))
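
For reference, Spark SQL's slice uses 1-based indexing, and a negative start counts from the end of the array. The plain-Ruby model below mirrors that rule on a local Array. spark_slice_model is a hypothetical helper for illustration and does not call the gem.

```ruby
# Hypothetical local model of Spark SQL's slice(array, start, length).
def spark_slice_model(array, start, length)
  raise ArgumentError, "start must not be 0" if start.zero?
  raise ArgumentError, "length must not be negative" if length.negative?

  # Convert the 1-based (or negative, from-the-end) start to a 0-based offset.
  offset = start.positive? ? start - 1 : array.length + start
  return [] if offset.negative? || offset >= array.length

  array[offset, length]
end

p spark_slice_model([10, 20, 30, 40], 2, 2)   # prints [20, 30]
p spark_slice_model([10, 20, 30, 40], -2, 5)  # prints [30, 40]
```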

#some(*cols) ⇒ Column

The Spark SQL some function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#sort_array(col, asc = true) ⇒ Column

Sorting helper. Returns the array col sorted in ascending order by default; pass asc = false for descending order.

# File 'lib/spark_connect/functions.rb', line 232

def sort_array(col, asc = true) = Column.invoke("sort_array", _col(col), lit(asc))

#soundex(*cols) ⇒ Column

The Spark SQL soundex function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#spark_partition_id ⇒ Column

The Spark SQL spark_partition_id function (takes no arguments).

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#split(col, pattern, limit = -1) ⇒ Column

Returns col split around matches of the literal regex pattern.

Returns:

  • (Column)

    col split around matches of the literal regex pattern.

# File 'lib/spark_connect/functions.rb', line 127

def split(col, pattern, limit = -1) = Column.invoke("split", _col(col), lit(pattern), lit(limit))
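
In Spark SQL, a positive limit caps the number of resulting elements, with the last element holding the unsplit remainder, while a limit of zero or less applies the pattern as many times as possible. The plain-Ruby model below approximates that on a local String. split_model is a hypothetical helper, and Ruby's regex engine differs from the Java engine Spark uses, so treat it as a sketch of the limit rule only.

```ruby
# Hypothetical local model of the limit rule of Spark SQL's split.
def split_model(str, pattern, limit = -1)
  # Ruby treats a negative limit as "no cap, keep trailing empty strings",
  # which is the closest match to Spark's behaviour for limit <= 0.
  str.split(Regexp.new(pattern), limit.positive? ? limit : -1)
end

p split_model("a,b,c", ",")      # prints ["a", "b", "c"]
p split_model("a,b,c", ",", 2)   # prints ["a", "b,c"]
p split_model("a,b,,", ",", 0)   # prints ["a", "b", "", ""]
```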

#sqrt(*cols) ⇒ Column

The Spark SQL sqrt function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#stddev(*cols) ⇒ Column

The Spark SQL stddev function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#stddev_pop(*cols) ⇒ Column

The Spark SQL stddev_pop function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#stddev_samp(*cols) ⇒ Column

The Spark SQL stddev_samp function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#struct(*cols) ⇒ Column

Returns a struct from the given columns.

Returns:

  • (Column)

    a struct from the given columns.



# File 'lib/spark_connect/functions.rb', line 94

def struct(*cols) = Column.invoke("struct", *cols.map { |c| _col(c) })

#substring(col, pos, len) ⇒ Column

Returns substring of length len from 1-based pos.

Returns:

  • (Column)

    substring of length len from 1-based pos.



# File 'lib/spark_connect/functions.rb', line 113

def substring(col, pos, len) = Column.invoke("substring", _col(col), lit(pos), lit(len))
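
The 1-based pos is the detail most likely to surprise Ruby users, whose String indexing is 0-based. The plain-Ruby model below illustrates the offset shift for positive positions only. substring_model is a hypothetical helper and deliberately leaves zero and negative positions unmodeled.

```ruby
# Hypothetical local model of substring for pos >= 1.
def substring_model(str, pos, len)
  raise ArgumentError, "this sketch only models pos >= 1" if pos < 1
  return "" if len <= 0

  # Shift the 1-based position to Ruby's 0-based index.
  str[pos - 1, len] || ""
end

puts substring_model("Spark SQL", 1, 5)  # prints Spark
puts substring_model("Spark SQL", 5, 1)  # prints k
```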

#substring_index(col, delim, count) ⇒ Column

Returns substring before the count-th occurrence of delim.

Returns:

  • (Column)

    substring before the count-th occurrence of delim.



# File 'lib/spark_connect/functions.rb', line 115

def substring_index(col, delim, count) = Column.invoke("substring_index", _col(col), lit(delim), lit(count))
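
In Spark SQL, a positive count keeps everything to the left of the count-th delimiter counted from the left, and a negative count keeps everything to the right of the count-th delimiter counted from the right. The plain-Ruby model below sketches that rule. substring_index_model is a hypothetical helper, and it may differ from Spark when delimiter occurrences overlap.

```ruby
# Hypothetical local model of substring_index(str, delim, count).
def substring_index_model(str, delim, count)
  return "" if count.zero? || delim.empty?

  # Escape the delimiter so it is matched literally, and keep empty fields.
  parts = str.split(Regexp.new(Regexp.escape(delim)), -1)
  kept = count.positive? ? parts.first(count) : parts.last(-count)
  kept.join(delim)
end

puts substring_index_model("www.apache.org", ".", 2)   # prints www.apache
puts substring_index_model("www.apache.org", ".", -1)  # prints org
puts substring_index_model("www.apache.org", ".", 5)   # prints www.apache.org
```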

#sum(*cols) ⇒ Column

The Spark SQL sum function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#sum_distinct(col) ⇒ Column

Returns sum of distinct values.

Returns:

  • (Column)

    sum of distinct values.



# File 'lib/spark_connect/functions.rb', line 75

def sum_distinct(col) = Column.invoke("sum", _col(col), is_distinct: true)

#tan(*cols) ⇒ Column

The Spark SQL tan function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#tanh(*cols) ⇒ Column

The Spark SQL tanh function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#timestamp_micros(*cols) ⇒ Column

The Spark SQL timestamp_micros function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#timestamp_millis(*cols) ⇒ Column

The Spark SQL timestamp_millis function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#timestamp_seconds(*cols) ⇒ Column

The Spark SQL timestamp_seconds function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

#to_date(col, fmt = nil) ⇒ Column

Returns col converted to a date, parsed with the literal format fmt when one is given.

# File 'lib/spark_connect/functions.rb', line 154

def to_date(col, fmt = nil) = fmt ? Column.invoke("to_date", _col(col), lit(fmt)) : Column.invoke("to_date", _col(col))

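#to_json(col, options = {}) ⇒ Column

Returns col serialized as a JSON string. Each entry of options is passed as a pair of literal strings (key, then value).
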
# File 'lib/spark_connect/functions.rb', line 186

def to_json(col, options = {})
  args = [_col(col)] + options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] }
  Column.invoke("to_json", *args)
end
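
The flat_map step above turns the options Hash into an alternating key/value list, stringifying both sides. The plain-Ruby sketch below shows that flattening on its own, without the lit wrapper. flatten_options is a hypothetical helper, and the option names are only examples.

```ruby
# Hypothetical helper isolating the flattening step used for options.
def flatten_options(options)
  # Symbols, booleans, and numbers all become Strings via to_s.
  options.flat_map { |key, value| [key.to_s, value.to_s] }
end

p flatten_options({ timestampFormat: "yyyy-MM-dd", pretty: true })
# prints ["timestampFormat", "yyyy-MM-dd", "pretty", "true"]
```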

#to_timestamp(col, fmt = nil) ⇒ Column

Returns col converted to a timestamp, parsed with the literal format fmt when one is given.

# File 'lib/spark_connect/functions.rb', line 155

def to_timestamp(col, fmt = nil) = fmt ? Column.invoke("to_timestamp", _col(col), lit(fmt)) : Column.invoke("to_timestamp", _col(col))

#to_utc_timestamp(col, tz) ⇒ Column

Returns col, read as a timestamp in the literal time zone tz, converted to UTC.

# File 'lib/spark_connect/functions.rb', line 171

def to_utc_timestamp(col, tz) = Column.invoke("to_utc_timestamp", _col(col), lit(tz))

#transform(col) {|element| ... } ⇒ Column

Transform each element of an array. The block receives a Column (and optionally the index) and returns a Column.

Yield Parameters:

  • element (Column)

Returns:

  • (Column)

# File 'lib/spark_connect/functions.rb', line 245

def transform(col, &block) = Column.invoke("transform", _col(col), _lambda(block))
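
The one-argument and two-argument block forms can be pictured with a plain-Ruby model that dispatches on the block's arity. This is a sketch of the semantics on a local Array, assuming Spark SQL's 0-based index for the optional second argument. transform_model is a hypothetical helper and says nothing about how _lambda is implemented.

```ruby
# Hypothetical local model of transform over an Array.
def transform_model(array, &block)
  if block.arity == 2
    # Two-parameter form: the block also receives the 0-based index.
    array.each_with_index.map { |element, index| block.call(element, index) }
  else
    array.map { |element| block.call(element) }
  end
end

p transform_model([1, 2, 3]) { |x| x * 2 }     # prints [2, 4, 6]
p transform_model([1, 2, 3]) { |x, i| x + i }  # prints [1, 3, 5]
```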

#transform_keys(col, &block) ⇒ Column

Applies the block to each entry of the map col and uses the result as that entry's new key.

# File 'lib/spark_connect/functions.rb', line 250

def transform_keys(col, &block) = Column.invoke("transform_keys", _col(col), _lambda(block))

#transform_values(col, &block) ⇒ Column

Applies the block to each entry of the map col and uses the result as that entry's new value.

# File 'lib/spark_connect/functions.rb', line 251

def transform_values(col, &block) = Column.invoke("transform_values", _col(col), _lambda(block))
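
In Spark SQL, the lambda for transform_keys and transform_values receives both the key and the value of each entry. The plain-Ruby model below sketches the two functions on a local Hash. Both helpers are hypothetical, and the assumption that the gem's block takes (key, value) in that order is not confirmed by the source shown here.

```ruby
# Hypothetical local models; the block receives (key, value) as in Spark SQL.
def transform_keys_model(map, &block)
  map.to_h { |key, value| [block.call(key, value), value] }
end

def transform_values_model(map, &block)
  map.to_h { |key, value| [key, block.call(key, value)] }
end

p transform_keys_model({ "a" => 1, "b" => 2 }) { |k, _v| k.upcase }
# prints {"A"=>1, "B"=>2}
p transform_values_model({ "a" => 1, "b" => 2 }) { |_k, v| v * 10 }
# prints {"a"=>10, "b"=>20}
```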

#translate(col, matching, replace) ⇒ Column

Returns col with each character found in matching replaced by the character at the same position in replace.

Returns:

  • (Column)

    col with each character found in matching replaced by the character at the same position in replace.

# File 'lib/spark_connect/functions.rb', line 129

def translate(col, matching, replace) = Column.invoke("translate", _col(col), lit(matching), lit(replace))
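
Spark SQL's translate maps characters position by position, and a character in matching with no counterpart in replace is deleted. That differs from Ruby's String#tr, which pads the replacement set. The plain-Ruby model below sketches the rule. translate_model is a hypothetical helper, and letting the first occurrence of a repeated character win is an assumption of this sketch.

```ruby
# Hypothetical local model of translate(input, matching, replace).
def translate_model(input, matching, replace)
  mapping = {}
  matching.each_char.with_index do |char, index|
    # The first occurrence of a repeated character wins; a missing
    # counterpart maps to "" so the character is deleted.
    mapping[char] = replace[index].to_s unless mapping.key?(char)
  end
  input.each_char.map { |char| mapping.fetch(char, char) }.join
end

puts translate_model("AaBbCc", "abc", "123")  # prints A1B2C3
puts translate_model("hello", "lo", "L")      # prints heLL
```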

#trim(*cols) ⇒ Column

The Spark SQL trim function. String arguments are treated as column names.

Returns:

  • (Column)



# File 'lib/spark_connect/functions.rb', line 822

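#trunc(col, fmt) ⇒ Column

Returns the date col truncated to the unit named by the literal format fmt (for example "year" or "month").
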
# File 'lib/spark_connect/functions.rb', line 162

def trunc(col, fmt) = Column.invoke("trunc", _col(col), lit(fmt))
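
The fmt argument names the unit to truncate to. The plain-Ruby model below sketches the year, quarter, month, and week units of Spark SQL's trunc on a local Date, with weeks starting on Monday. trunc_model is a hypothetical helper, and it raises for any format it does not model rather than guessing at Spark's behaviour.

```ruby
require "date"

# Hypothetical local model of trunc(date, fmt) for a few common units.
def trunc_model(date, fmt)
  case fmt.downcase
  when "year", "yyyy", "yy"
    Date.new(date.year, 1, 1)
  when "quarter"
    # First month of the quarter: 1, 4, 7, or 10.
    Date.new(date.year, ((date.month - 1) / 3) * 3 + 1, 1)
  when "month", "mon", "mm"
    Date.new(date.year, date.month, 1)
  when "week"
    # Step back to the Monday of the same week (wday: Sunday = 0).
    date - ((date.wday + 6) % 7)
  else
    raise ArgumentError, "format not modeled in this sketch: #{fmt}"
  end
end

day = Date.new(2019, 8, 4)
puts trunc_model(day, "year")     # prints 2019-01-01
puts trunc_model(day, "quarter")  # prints 2019-07-01
puts trunc_model(day, "MM")       # prints 2019-08-01
puts trunc_model(day, "week")     # prints 2019-07-29
```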

#typeof(*cols) ⇒ Column

The Spark SQL typeof function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#ucase(*cols) ⇒ Column

The Spark SQL ucase function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#udf ⇒ Object

UDFs require a server-side execution environment (Python/Scala) and are not supported by the pure-Ruby client.



# File 'lib/spark_connect/functions.rb', line 273

def udf(*)
  raise NotImplementedError, "User-defined functions are not supported by the Ruby Spark Connect client"
end
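
In Ruby, `NotImplementedError` descends from `ScriptError` rather than `StandardError`, so a bare `rescue` does not catch it. The sketch below shows one way a caller might guard against this. `UdfSketch` is a hypothetical stand-in that raises the same error as the method above, and `udf_supported?` is an illustrative helper, not part of the library.

```ruby
# Stand-in that mimics the behaviour documented above (hypothetical module).
module UdfSketch
  module_function

  def udf(*)
    raise NotImplementedError, "User-defined functions are not supported by the Ruby Spark Connect client"
  end
end

# Name the class explicitly: a bare `rescue` only handles StandardError.
def udf_supported?(functions)
  functions.udf
  true
rescue NotImplementedError
  false
end

supported = udf_supported?(UdfSketch)
```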

#unbase64(*cols) ⇒ Column

The Spark SQL unbase64 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#unhex(*cols) ⇒ Column

The Spark SQL unhex function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#unix_date(*cols) ⇒ Column

The Spark SQL unix_date function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#unix_micros(*cols) ⇒ Column

The Spark SQL unix_micros function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#unix_millis(*cols) ⇒ Column

The Spark SQL unix_millis function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#unix_seconds(*cols) ⇒ Column

The Spark SQL unix_seconds function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ Object



# File 'lib/spark_connect/functions.rb', line 166

def unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss")
  col.nil? ? Column.invoke("unix_timestamp") : Column.invoke("unix_timestamp", _col(col), lit(fmt))
end
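
The definition above takes two shapes: with no column it emits a zero-argument call, and otherwise it passes the column together with `fmt` as a literal. The sketch below traces what each call builds. `TsNode` and `TsSketch` are hypothetical stand-ins for the library's `Column` and helper methods, whose real behaviour may differ.

```ruby
# Stand-in expression node (hypothetical; the real Column is richer).
TsNode = Struct.new(:fn, :args)

module TsSketch
  module_function

  # A literal value, kept distinct from a column reference.
  def lit(value)
    TsNode.new("lit", [value])
  end

  # A String names a column; an existing node passes through.
  def _col(value)
    value.is_a?(TsNode) ? value : TsNode.new("col", [value])
  end

  # Same branching as the definition above.
  def unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss")
    if col.nil?
      TsNode.new("unix_timestamp", [])
    else
      TsNode.new("unix_timestamp", [_col(col), lit(fmt)])
    end
  end
end

now_call    = TsSketch.unix_timestamp
parsed_call = TsSketch.unix_timestamp("created_at", "yyyy-MM-dd")
```

Because of the `col.nil?` branch, `fmt` is ignored whenever `col` is omitted.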

#upper(*cols) ⇒ Column

The Spark SQL upper function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#uuid ⇒ Column

The Spark SQL uuid function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#var_pop(*cols) ⇒ Column

The Spark SQL var_pop function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#var_samp(*cols) ⇒ Column

The Spark SQL var_samp function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#variance(*cols) ⇒ Column

The Spark SQL variance function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#version ⇒ Column

The Spark SQL version function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#weekday(*cols) ⇒ Column

The Spark SQL weekday function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#weekofyear(*cols) ⇒ Column

The Spark SQL weekofyear function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#when(condition, value) ⇒ Column

Starts a CASE WHEN expression. Add further branches with Column#when and finish with Column#otherwise.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 51

def when(condition, value)
  Column.invoke("when", condition, value)
end
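
The chaining style can be illustrated without a Spark session. The sketch below uses a hypothetical `CaseSketch` builder, not `SparkConnect::Column`, to show how each `when` call might append a branch and `otherwise` might set the ELSE value. The `to_sql` rendering is an assumption made only so the result can be inspected.

```ruby
# Stand-in for a CASE WHEN builder (hypothetical; not SparkConnect::Column).
class CaseSketch
  attr_reader :branches, :default

  def initialize(branches, default = nil)
    @branches = branches
    @default = default
  end

  # Append another WHEN branch, returning a new builder.
  def when(condition, value)
    CaseSketch.new(branches + [[condition, value]], default)
  end

  # Set the ELSE value, returning a new builder.
  def otherwise(value)
    CaseSketch.new(branches, value)
  end

  # Render the accumulated branches as a SQL-like string for inspection.
  def to_sql
    body = branches.map { |cond, val| "WHEN #{cond} THEN '#{val}'" }.join(" ")
    tail = default.nil? ? "" : " ELSE '#{default}'"
    "CASE #{body}#{tail} END"
  end
end

case_expr = CaseSketch.new([]).when("x > 0", "pos").when("x < 0", "neg").otherwise("zero")
case_sql = case_expr.to_sql
```

Returning a new builder from each call keeps intermediate expressions reusable, which is one common way to make this kind of chain safe to share.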

#xxhash64(*cols) ⇒ Column

The Spark SQL xxhash64 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#year(*cols) ⇒ Column

The Spark SQL year function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#zip_with(left, right, &block) ⇒ Object



# File 'lib/spark_connect/functions.rb', line 249

def zip_with(left, right, &block) = Column.invoke("zip_with", _col(left), _col(right), _lambda(block))