Module: SparkConnect::Functions

Extended by: Functions

Included in: Functions

Defined in: lib/spark_connect/functions.rb

Overview

The standard Spark SQL function library, mirroring PySpark's pyspark.sql.functions. Every function returns a Column.

Available both as SparkConnect::Functions and the shorthand SparkConnect::F. All methods are module functions.

Following PySpark's convention, a String argument denotes a column name for most functions (e.g. F.sum("salary") aggregates the salary column), while functions whose parameters are genuinely literal (regex patterns, date formats, JSON paths, ...) treat their String arguments as literal values.

Examples:

F = SparkConnect::F
F.col("a") + F.lit(1)
F.when(F.col("x") > 0, "pos").otherwise("non-pos")
F.sum("amount").alias("total")

Constant Summary

Proto =

SparkConnect::Proto

UNIFORM = Functions whose arguments are all ColumnOrName (a String denotes a column name). These functions, like the NO_ARG functions, are generated programmatically to keep the surface complete and compact. @!method directives document them so that they appear in the API reference; each returns a Column.

%w[
  sum avg mean max min first last stddev stddev_samp stddev_pop variance var_samp var_pop
  skewness kurtosis collect_list collect_set first_value last_value max_by min_by corr
  covar_pop covar_samp median mode any_value every some bit_and bit_or bit_xor bool_and bool_or
  product count_if grouping
  abs acos acosh asin asinh atan atanh atan2 bin cbrt ceil ceiling cos cosh cot csc degrees
  exp expm1 factorial floor hypot ln log log2 log10 log1p negative negate positive pow power
  radians rint sec signum sin sinh sqrt tan tanh hex unhex pmod isnan isnull positive
  upper lower ltrim rtrim trim length char_length character_length octet_length bit_length
  reverse ascii base64 unbase64 initcap soundex crc32 md5 sha1 sha ucase lcase
  size cardinality array_distinct array_max array_min array_compact flatten explode explode_outer
  posexplode posexplode_outer inline inline_outer map_keys map_values map_entries map_from_entries
  array_sort shuffle arrays_zip map_concat concat greatest least hash xxhash64
  array_union array_intersect array_except arrays_overlap
  year quarter month dayofmonth day dayofweek dayofyear hour minute second weekofyear last_day
  weekday unix_date unix_micros unix_millis unix_seconds timestamp_seconds timestamp_millis
  timestamp_micros date_from_unix_date
  bitwise_not bit_count typeof
].uniq.freeze

NO_ARG = No-argument functions.

%w[
  current_date current_timestamp now current_timezone current_user current_catalog
  current_database current_schema monotonically_increasing_id spark_partition_id
  input_file_name input_file_block_start input_file_block_length version uuid
  row_number rank dense_rank percent_rank cume_dist
].freeze
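
The generating loop itself is not shown in this reference. The following is only a sketch of one way such lists could drive method generation, assuming define_method combined with module_function. SketchFunctions, Call, and the shortened lists are hypothetical stand-ins, not the gem's code.

```ruby
# Hypothetical stand-in for the gem's Column: it only records the call.
Call = Struct.new(:name, :args)

module SketchFunctions
  # Shortened, illustrative lists (the real ones are shown above).
  UNIFORM = %w[sum avg upper].freeze
  NO_ARG  = %w[current_date uuid].freeze

  # Strings and Symbols name columns; other values pass through unchanged here.
  def _col(value)
    value.is_a?(String) || value.is_a?(Symbol) ? [:col, value.to_s] : value
  end
  module_function :_col

  # One method per uniform name; every argument goes through _col.
  UNIFORM.each do |name|
    define_method(name) { |*cols| Call.new(name, cols.map { |c| _col(c) }) }
    module_function name
  end

  # No-argument functions ignore column coercion entirely.
  NO_ARG.each do |name|
    define_method(name) { Call.new(name, []) }
    module_function name
  end
end

total = SketchFunctions.sum("salary")
stamp = SketchFunctions.current_date
```

Generating the methods from a list keeps each definition to a single line of data, at the cost of the reference showing the list rather than a per-method body.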

Class Attribute Summary

.lambda_counter ⇒ Object private

Instance Method Summary

#_col(value) ⇒ Object private
ColumnOrName coercion: String/Symbol -> column reference, Column -> itself, everything else -> literal.
#_lambda(block) ⇒ Object private
Build a Column wrapping a LambdaFunction from a Ruby block.
#_lit_or_col(value) ⇒ Object private
#abs(*cols) ⇒ Column
The Spark SQL abs function.
#acos(*cols) ⇒ Column
The Spark SQL acos function.
#acosh(*cols) ⇒ Column
The Spark SQL acosh function.
#add_months(col, months) ⇒ Object
#aggregate(col, initial, merge, finish = nil) ⇒ Column
Aggregate (fold) an array.
#any_value(*cols) ⇒ Column
The Spark SQL any_value function.
#approx_count_distinct(col, rsd = nil) ⇒ Column
Approximate distinct count (optionally with relative SD).
#array(*cols) ⇒ Column
An array from the given columns.
#array_append(col, value) ⇒ Object
#array_compact(*cols) ⇒ Column
The Spark SQL array_compact function.
#array_contains(col, value) ⇒ Object
#array_distinct(*cols) ⇒ Column
The Spark SQL array_distinct function.
#array_except(*cols) ⇒ Column
The Spark SQL array_except function.
#array_insert(col, pos, value) ⇒ Object
#array_intersect(*cols) ⇒ Column
The Spark SQL array_intersect function.
#array_join(col, delimiter, null_replacement = nil) ⇒ Object
#array_max(*cols) ⇒ Column
The Spark SQL array_max function.
#array_min(*cols) ⇒ Column
The Spark SQL array_min function.
#array_position(col, value) ⇒ Object
#array_prepend(col, value) ⇒ Object
#array_remove(col, element) ⇒ Object
#array_repeat(col, count) ⇒ Object
#array_sort(*cols) ⇒ Column
The Spark SQL array_sort function.
#array_union(*cols) ⇒ Column
The Spark SQL array_union function.
#arrays_overlap(*cols) ⇒ Column
The Spark SQL arrays_overlap function.
#arrays_zip(*cols) ⇒ Column
The Spark SQL arrays_zip function.
#asc(col) ⇒ Column
An ascending sort order for the named/given column.
#asc_nulls_first(col) ⇒ Object
#asc_nulls_last(col) ⇒ Object
#ascii(*cols) ⇒ Column
The Spark SQL ascii function.
#asin(*cols) ⇒ Column
The Spark SQL asin function.
#asinh(*cols) ⇒ Column
The Spark SQL asinh function.
#atan(*cols) ⇒ Column
The Spark SQL atan function.
#atan2(*cols) ⇒ Column
The Spark SQL atan2 function.
#atanh(*cols) ⇒ Column
The Spark SQL atanh function.
#avg(*cols) ⇒ Column
The Spark SQL avg function.
#base64(*cols) ⇒ Column
The Spark SQL base64 function.
#bin(*cols) ⇒ Column
The Spark SQL bin function.
#bit_and(*cols) ⇒ Column
The Spark SQL bit_and function.
#bit_count(*cols) ⇒ Column
The Spark SQL bit_count function.
#bit_length(*cols) ⇒ Column
The Spark SQL bit_length function.
#bit_or(*cols) ⇒ Column
The Spark SQL bit_or function.
#bit_xor(*cols) ⇒ Column
The Spark SQL bit_xor function.
#bitwise_not(*cols) ⇒ Column
The Spark SQL bitwise_not function.
#bool_and(*cols) ⇒ Column
The Spark SQL bool_and function.
#bool_or(*cols) ⇒ Column
The Spark SQL bool_or function.
#broadcast(df) ⇒ DataFrame
Mark a DataFrame for broadcast (map-side) join.
#bround(col, scale = 0) ⇒ Column
HALF_EVEN ("banker's") rounding to scale places.
#cardinality(*cols) ⇒ Column
The Spark SQL cardinality function.
#cbrt(*cols) ⇒ Column
The Spark SQL cbrt function.
#ceil(*cols) ⇒ Column
The Spark SQL ceil function.
#ceiling(*cols) ⇒ Column
The Spark SQL ceiling function.
#char_length(*cols) ⇒ Column
The Spark SQL char_length function.
#character_length(*cols) ⇒ Column
The Spark SQL character_length function.
#coalesce(*cols) ⇒ Column
First non-null among the given columns.
#col(name) ⇒ Column (also: #column)
A column reference by name.
#collect_list(*cols) ⇒ Column
The Spark SQL collect_list function.
#collect_set(*cols) ⇒ Column
The Spark SQL collect_set function.
#concat(*cols) ⇒ Column
The Spark SQL concat function.
#concat_ws(sep, *cols) ⇒ Column
Concatenation of columns separated by literal sep.
#conv(col, from_base, to_base) ⇒ Column
Convert a number string from from_base to to_base.
#corr(*cols) ⇒ Column
The Spark SQL corr function.
#cos(*cols) ⇒ Column
The Spark SQL cos function.
#cosh(*cols) ⇒ Column
The Spark SQL cosh function.
#cot(*cols) ⇒ Column
The Spark SQL cot function.
#count(col) ⇒ Column
Count of rows (or non-null values of a column).
#count_distinct(*cols) ⇒ Column (also: #countDistinct)
Count of distinct combinations of the given columns.
#count_if(*cols) ⇒ Column
The Spark SQL count_if function.
#covar_pop(*cols) ⇒ Column
The Spark SQL covar_pop function.
#covar_samp(*cols) ⇒ Column
The Spark SQL covar_samp function.
#crc32(*cols) ⇒ Column
The Spark SQL crc32 function.
#create_map(*cols) ⇒ Column
A map from alternating key/value columns.
#csc(*cols) ⇒ Column
The Spark SQL csc function.
#cume_dist ⇒ Column
The Spark SQL cume_dist function (takes no arguments).
#current_catalog ⇒ Column
The Spark SQL current_catalog function (takes no arguments).
#current_database ⇒ Column
The Spark SQL current_database function (takes no arguments).
#current_date ⇒ Column
The Spark SQL current_date function (takes no arguments).
#current_schema ⇒ Column
The Spark SQL current_schema function (takes no arguments).
#current_timestamp ⇒ Column
The Spark SQL current_timestamp function (takes no arguments).
#current_timezone ⇒ Column
The Spark SQL current_timezone function (takes no arguments).
#current_user ⇒ Column
The Spark SQL current_user function (takes no arguments).
#date_add(col, days) ⇒ Object
#date_format(col, fmt) ⇒ Object
#date_from_unix_date(*cols) ⇒ Column
The Spark SQL date_from_unix_date function.
#date_sub(col, days) ⇒ Object
#date_trunc(fmt, col) ⇒ Object
#datediff(end_col, start_col) ⇒ Object
#day(*cols) ⇒ Column
The Spark SQL day function.
#dayofmonth(*cols) ⇒ Column
The Spark SQL dayofmonth function.
#dayofweek(*cols) ⇒ Column
The Spark SQL dayofweek function.
#dayofyear(*cols) ⇒ Column
The Spark SQL dayofyear function.
#degrees(*cols) ⇒ Column
The Spark SQL degrees function.
#dense_rank ⇒ Column
The Spark SQL dense_rank function (takes no arguments).
#desc(col) ⇒ Object
#desc_nulls_first(col) ⇒ Object
#desc_nulls_last(col) ⇒ Object
#element_at(col, extraction) ⇒ Object
#every(*cols) ⇒ Column
The Spark SQL every function.
#exists(col, &block) ⇒ Object
#exp(*cols) ⇒ Column
The Spark SQL exp function.
#explode(*cols) ⇒ Column
The Spark SQL explode function.
#explode_outer(*cols) ⇒ Column
The Spark SQL explode_outer function.
#expm1(*cols) ⇒ Column
The Spark SQL expm1 function.
#expr(sql) ⇒ Column
Parse a SQL expression string into a Column.
#factorial(*cols) ⇒ Column
The Spark SQL factorial function.
#filter(col, &block) ⇒ Object
#first(*cols) ⇒ Column
The Spark SQL first function.
#first_value(*cols) ⇒ Column
The Spark SQL first_value function.
#flatten(*cols) ⇒ Column
The Spark SQL flatten function.
#floor(*cols) ⇒ Column
The Spark SQL floor function.
#forall(col, &block) ⇒ Object
#format_number(col, d) ⇒ Column
Number formatted to d decimal places.
#format_string(fmt, *cols) ⇒ Column
Printf-style formatting using literal fmt.
#from_json(col, schema, options = {}) ⇒ Object
#from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ Object
#from_utc_timestamp(col, tz) ⇒ Object
#get_json_object(col, path) ⇒ Object
#greatest(*cols) ⇒ Column
The Spark SQL greatest function.
#grouping(*cols) ⇒ Column
The Spark SQL grouping function.
#hash(*cols) ⇒ Column
The Spark SQL hash function.
#hex(*cols) ⇒ Column
The Spark SQL hex function.
#hour(*cols) ⇒ Column
The Spark SQL hour function.
#hypot(*cols) ⇒ Column
The Spark SQL hypot function.
#initcap(*cols) ⇒ Column
The Spark SQL initcap function.
#inline(*cols) ⇒ Column
The Spark SQL inline function.
#inline_outer(*cols) ⇒ Column
The Spark SQL inline_outer function.
#input_file_block_length ⇒ Column
The Spark SQL input_file_block_length function (takes no arguments).
#input_file_block_start ⇒ Column
The Spark SQL input_file_block_start function (takes no arguments).
#input_file_name ⇒ Column
The Spark SQL input_file_name function (takes no arguments).
#instr(col, substr) ⇒ Column
1-based position of literal substr within col (0 if absent).
#isnan(*cols) ⇒ Column
The Spark SQL isnan function.
#isnull(*cols) ⇒ Column
The Spark SQL isnull function.
#json_tuple(col, *fields) ⇒ Object
#kurtosis(*cols) ⇒ Column
The Spark SQL kurtosis function.
#lag(col, offset = 1, default = nil) ⇒ Object
#last(*cols) ⇒ Column
The Spark SQL last function.
#last_day(*cols) ⇒ Column
The Spark SQL last_day function.
#last_value(*cols) ⇒ Column
The Spark SQL last_value function.
#lcase(*cols) ⇒ Column
The Spark SQL lcase function.
#lead(col, offset = 1, default = nil) ⇒ Object
#least(*cols) ⇒ Column
The Spark SQL least function.
#length(*cols) ⇒ Column
The Spark SQL length function.
#lit(value) ⇒ Column
A literal value column.
#ln(*cols) ⇒ Column
The Spark SQL ln function.
#locate(substr, col, pos = 1) ⇒ Column
1-based position of substr in col at/after pos.
#log(*cols) ⇒ Column
The Spark SQL log function.
#log10(*cols) ⇒ Column
The Spark SQL log10 function.
#log1p(*cols) ⇒ Column
The Spark SQL log1p function.
#log2(*cols) ⇒ Column
The Spark SQL log2 function.
#lower(*cols) ⇒ Column
The Spark SQL lower function.
#lpad(col, len, pad) ⇒ Column
Left-padded string.
#ltrim(*cols) ⇒ Column
The Spark SQL ltrim function.
#make_date(year, month, day) ⇒ Object
#map_concat(*cols) ⇒ Column
The Spark SQL map_concat function.
#map_contains_key(col, key) ⇒ Object
#map_entries(*cols) ⇒ Column
The Spark SQL map_entries function.
#map_filter(col, &block) ⇒ Object
#map_from_arrays(keys, values) ⇒ Column
A map from two array columns (keys, values).
#map_from_entries(*cols) ⇒ Column
The Spark SQL map_from_entries function.
#map_keys(*cols) ⇒ Column
The Spark SQL map_keys function.
#map_values(*cols) ⇒ Column
The Spark SQL map_values function.
#map_zip_with(c1, c2, &block) ⇒ Object
#max(*cols) ⇒ Column
The Spark SQL max function.
#max_by(*cols) ⇒ Column
The Spark SQL max_by function.
#md5(*cols) ⇒ Column
The Spark SQL md5 function.
#mean(*cols) ⇒ Column
The Spark SQL mean function.
#median(*cols) ⇒ Column
The Spark SQL median function.
#min(*cols) ⇒ Column
The Spark SQL min function.
#min_by(*cols) ⇒ Column
The Spark SQL min_by function.
#minute(*cols) ⇒ Column
The Spark SQL minute function.
#mode(*cols) ⇒ Column
The Spark SQL mode function.
#monotonically_increasing_id ⇒ Column
The Spark SQL monotonically_increasing_id function (takes no arguments).
#month(*cols) ⇒ Column
The Spark SQL month function.
#months_between(d1, d2, round_off = true) ⇒ Object
#named_struct(*cols) ⇒ Column
A named struct from alternating name/value arguments.
#nanvl(col1, col2) ⇒ Column
col1 if it is not NaN, else col2.
#negate(*cols) ⇒ Column
The Spark SQL negate function.
#negative(*cols) ⇒ Column
The Spark SQL negative function.
#next_day(col, day_of_week) ⇒ Object
#now ⇒ Column
The Spark SQL now function (takes no arguments).
#nth_value(col, offset, ignore_nulls = false) ⇒ Object
#ntile(n) ⇒ Object
#octet_length(*cols) ⇒ Column
The Spark SQL octet_length function.
#overlay(col, replace, pos, len = -1) ⇒ Column
Overlay replace into col at pos for len chars.
#percent_rank ⇒ Column
The Spark SQL percent_rank function (takes no arguments).
#pmod(*cols) ⇒ Column
The Spark SQL pmod function.
#posexplode(*cols) ⇒ Column
The Spark SQL posexplode function.
#posexplode_outer(*cols) ⇒ Column
The Spark SQL posexplode_outer function.
#positive(*cols) ⇒ Column
The Spark SQL positive function.
#pow(*cols) ⇒ Column
The Spark SQL pow function.
#power(*cols) ⇒ Column
The Spark SQL power function.
#product(*cols) ⇒ Column
The Spark SQL product function.
#quarter(*cols) ⇒ Column
The Spark SQL quarter function.
#radians(*cols) ⇒ Column
The Spark SQL radians function.
#rand(seed = nil) ⇒ Object
#randn(seed = nil) ⇒ Object
#rank ⇒ Column
The Spark SQL rank function (takes no arguments).
#regexp_count(col, pattern) ⇒ Object
#regexp_extract(col, pattern, idx = 0) ⇒ Column
The idx-th group of pattern matched in col.
#regexp_extract_all(col, pattern, idx = 1) ⇒ Column
All matches of group idx of pattern.
#regexp_like(col, pattern) ⇒ Column
Whether col matches pattern.
#regexp_replace(col, pattern, replacement) ⇒ Column
col with pattern replaced by replacement.
#regexp_substr(col, pattern) ⇒ Object
#repeat(col, n) ⇒ Column
The string repeated n times.
#reverse(*cols) ⇒ Column
The Spark SQL reverse function.
#rint(*cols) ⇒ Column
The Spark SQL rint function.
#round(col, scale = 0) ⇒ Column
HALF_UP rounding to scale decimal places.
#row_number ⇒ Column
The Spark SQL row_number function (takes no arguments).
#rpad(col, len, pad) ⇒ Column
Right-padded string.
#rtrim(*cols) ⇒ Column
The Spark SQL rtrim function.
#schema_of_json(json, options = {}) ⇒ Object
#sec(*cols) ⇒ Column
The Spark SQL sec function.
#second(*cols) ⇒ Column
The Spark SQL second function.
#sequence(start, stop, step = nil) ⇒ Object
#sha(*cols) ⇒ Column
The Spark SQL sha function.
#sha1(*cols) ⇒ Column
The Spark SQL sha1 function.
#sha2(col, num_bits) ⇒ Column
SHA-2 hash with the given bit length (224/256/384/512).
#shiftleft(col, num_bits) ⇒ Column
Left shift / right shift by literal bit counts.
#shiftright(col, num_bits) ⇒ Object
#shiftrightunsigned(col, num_bits) ⇒ Object
#shuffle(*cols) ⇒ Column
The Spark SQL shuffle function.
#signum(*cols) ⇒ Column
The Spark SQL signum function.
#sin(*cols) ⇒ Column
The Spark SQL sin function.
#sinh(*cols) ⇒ Column
The Spark SQL sinh function.
#size(*cols) ⇒ Column
The Spark SQL size function.
#skewness(*cols) ⇒ Column
The Spark SQL skewness function.
#slice(col, start, length) ⇒ Object
#some(*cols) ⇒ Column
The Spark SQL some function.
#sort_array(col, asc = true) ⇒ Object
#soundex(*cols) ⇒ Column
The Spark SQL soundex function.
#spark_partition_id ⇒ Column
The Spark SQL spark_partition_id function (takes no arguments).
#split(col, pattern, limit = -1) ⇒ Column
Split col by the literal regex pattern.
#sqrt(*cols) ⇒ Column
The Spark SQL sqrt function.
#stddev(*cols) ⇒ Column
The Spark SQL stddev function.
#stddev_pop(*cols) ⇒ Column
The Spark SQL stddev_pop function.
#stddev_samp(*cols) ⇒ Column
The Spark SQL stddev_samp function.
#struct(*cols) ⇒ Column
A struct from the given columns.
#substring(col, pos, len) ⇒ Column
Substring of length len from 1-based pos.
#substring_index(col, delim, count) ⇒ Column
Substring before the count-th occurrence of delim.
#sum(*cols) ⇒ Column
The Spark SQL sum function.
#sum_distinct(col) ⇒ Column
Sum of distinct values.
#tan(*cols) ⇒ Column
The Spark SQL tan function.
#tanh(*cols) ⇒ Column
The Spark SQL tanh function.
#timestamp_micros(*cols) ⇒ Column
The Spark SQL timestamp_micros function.
#timestamp_millis(*cols) ⇒ Column
The Spark SQL timestamp_millis function.
#timestamp_seconds(*cols) ⇒ Column
The Spark SQL timestamp_seconds function.
#to_date(col, fmt = nil) ⇒ Object
#to_json(col, options = {}) ⇒ Object
#to_timestamp(col, fmt = nil) ⇒ Object
#to_utc_timestamp(col, tz) ⇒ Object
#transform(col) {|element| ... } ⇒ Column
Transform each element of an array.
#transform_keys(col, &block) ⇒ Object
#transform_values(col, &block) ⇒ Object
#translate(col, matching, replace) ⇒ Column
Characters of col matching matching replaced per replace.
#trim(*cols) ⇒ Column
The Spark SQL trim function.
#trunc(col, fmt) ⇒ Object
#typeof(*cols) ⇒ Column
The Spark SQL typeof function.
#ucase(*cols) ⇒ Column
The Spark SQL ucase function.
#udf ⇒ Object
UDFs require a server-side execution environment (Python/Scala) and are not supported by the pure-Ruby client.
#unbase64(*cols) ⇒ Column
The Spark SQL unbase64 function.
#unhex(*cols) ⇒ Column
The Spark SQL unhex function.
#unix_date(*cols) ⇒ Column
The Spark SQL unix_date function.
#unix_micros(*cols) ⇒ Column
The Spark SQL unix_micros function.
#unix_millis(*cols) ⇒ Column
The Spark SQL unix_millis function.
#unix_seconds(*cols) ⇒ Column
The Spark SQL unix_seconds function.
#unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ Object
#upper(*cols) ⇒ Column
The Spark SQL upper function.
#uuid ⇒ Column
The Spark SQL uuid function (takes no arguments).
#var_pop(*cols) ⇒ Column
The Spark SQL var_pop function.
#var_samp(*cols) ⇒ Column
The Spark SQL var_samp function.
#variance(*cols) ⇒ Column
The Spark SQL variance function.
#version ⇒ Column
The Spark SQL version function (takes no arguments).
#weekday(*cols) ⇒ Column
The Spark SQL weekday function.
#weekofyear(*cols) ⇒ Column
The Spark SQL weekofyear function.
#when(condition, value) ⇒ Column
Start a CASE WHEN expression.
#xxhash64(*cols) ⇒ Column
The Spark SQL xxhash64 function.
#year(*cols) ⇒ Column
The Spark SQL year function.
#zip_with(left, right, &block) ⇒ Object

Class Attribute Details

.lambda_counter ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/spark_connect/functions.rb', line 880

def lambda_counter
  @lambda_counter
end

Instance Method Details

#_col(value) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

ColumnOrName coercion: String/Symbol -> column reference, Column -> itself, everything else -> literal.

# File 'lib/spark_connect/functions.rb', line 863

def _col(value)
  case value
  when Column then value
  when String, Symbol then col(value.to_s)
  else lit(value)
  end
end
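
To make the three coercion branches concrete, here is a minimal, self-contained sketch. SketchColumn and the local col, lit, and coerce helpers are hypothetical stand-ins for the gem's Column and its functions (the real Column wraps a protobuf expression); only the branching mirrors the method above.

```ruby
# Hypothetical stand-in for Column: records whether it is a reference or a literal.
SketchColumn = Struct.new(:kind, :value)

def col(name)
  SketchColumn.new(:column, name)
end

def lit(value)
  SketchColumn.new(:literal, value)
end

# Same branching as _col: Column -> itself, String/Symbol -> column reference,
# everything else -> literal.
def coerce(value)
  case value
  when SketchColumn then value
  when String, Symbol then col(value.to_s)
  else lit(value)
  end
end

existing  = col("a")
by_string = coerce("salary")   # column reference named "salary"
by_symbol = coerce(:salary)    # Symbols are converted with to_s first
by_number = coerce(42)         # any other value becomes a literal
unchanged = coerce(existing)   # an existing column is returned as-is
```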

#_lambda(block) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build a Column wrapping a LambdaFunction from a Ruby block. The block is called with one or more lambda-variable columns and must return a Column.

# File 'lib/spark_connect/functions.rb', line 886

def _lambda(block)
  arity = block.arity.negative? ? 1 : [block.arity, 1].max
  Functions.lambda_counter += 1
  names = (0...arity).map { |i| "x_#{Functions.lambda_counter}_#{i}" }
  vars = names.map do |n|
    Proto::Expression::UnresolvedNamedLambdaVariable.new(name_parts: [n])
  end
  cols = vars.map { |v| Column.new(Proto::Expression.new(unresolved_named_lambda_variable: v)) }
  body = block.call(*cols)
  Column.new(Proto::Expression.new(
               lambda_function: Proto::Expression::LambdaFunction.new(function: body.to_expr, arguments: vars)
             ))
end
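
The arity rule on the first line of _lambda decides how many lambda variables are created. The sketch below isolates just that rule and the naming scheme. lambda_var_names is a hypothetical helper, and the counter is passed in explicitly rather than read from Functions.lambda_counter.

```ruby
# Mirrors the arity rule in _lambda: a negative arity (for example a splat
# block) is treated as one variable, and a zero-arity block still gets one.
def lambda_var_names(block, counter)
  arity = block.arity.negative? ? 1 : [block.arity, 1].max
  (0...arity).map { |i| "x_#{counter}_#{i}" }
end

one_arg  = proc { |x| x }
two_args = proc { |acc, x| [acc, x] }
no_args  = proc { nil }
splat    = proc { |*xs| xs }

names_one   = lambda_var_names(one_arg, 1)   # one variable
names_two   = lambda_var_names(two_args, 2)  # two variables
names_none  = lambda_var_names(no_args, 3)   # still one variable
names_splat = lambda_var_names(splat, 4)     # negative arity -> one variable
```

Because the counter is part of every name, two lambdas built one after another get distinct variable names, which presumably keeps nested lambdas from shadowing each other.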

#_lit_or_col(value) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/spark_connect/functions.rb', line 872

def _lit_or_col(value)
  value.is_a?(Column) ? value : lit(value)
end

#abs(*cols) ⇒ `Column`

The Spark SQL abs function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#acos(*cols) ⇒ `Column`

The Spark SQL acos function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#acosh(*cols) ⇒ `Column`

The Spark SQL acosh function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#add_months(col, months) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 159

def add_months(col, months) = Column.invoke("add_months", _col(col), lit(months))

#aggregate(col, initial, merge, finish = nil) ⇒ `Column`

Aggregate (fold) an array. merge combines accumulator and element; optional finish post-processes the result.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 258

def aggregate(col, initial, merge, finish = nil)
  args = [_col(col), _col(initial), _lambda(merge)]
  args << _lambda(finish) if finish
  Column.invoke("aggregate", *args)
end
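
The fold semantics can be illustrated with a plain-Ruby analogy. This runs on an ordinary Array and is not how Spark evaluates the expression. In the method above, merge and finish are instead passed to _lambda, so they receive Columns and must return a Column. The fold helper here is hypothetical.

```ruby
# Plain-Ruby analogy of aggregate(col, initial, merge, finish):
# merge combines the accumulator with each element, finish post-processes.
def fold(array, initial, merge, finish = nil)
  acc = array.inject(initial) { |memo, element| merge.call(memo, element) }
  finish ? finish.call(acc) : acc
end

add = ->(acc, x) { acc + x }

total = fold([1, 2, 3, 4], 0, add)                        # 10
mean  = fold([1, 2, 3, 4], 0, add, ->(acc) { acc / 4.0 }) # 2.5
```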

#any_value(*cols) ⇒ `Column`

The Spark SQL any_value function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#approx_count_distinct(col, rsd = nil) ⇒ `Column`

Returns the approximate distinct count, optionally bounded by a maximum relative standard deviation (rsd).

Returns:

(Column) —
the approximate distinct count.



# File 'lib/spark_connect/functions.rb', line 70

def approx_count_distinct(col, rsd = nil)
  rsd.nil? ? Column.invoke("approx_count_distinct", _col(col)) : Column.invoke("approx_count_distinct", _col(col), lit(rsd))
end

#array(*cols) ⇒ `Column`

Returns an array from the given columns.

Returns:

(Column) —
an array from the given columns.



# File 'lib/spark_connect/functions.rb', line 96

def array(*cols) = Column.invoke("array", *cols.map { |c| _col(c) })

#array_append(col, value) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 201

def array_append(col, value) = Column.invoke("array_append", _col(col), lit(value))

#array_compact(*cols) ⇒ `Column`

The Spark SQL array_compact function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#array_contains(col, value) ⇒ `Object`


# File 'lib/spark_connect/functions.rb', line 197

def array_contains(col, value) = Column.invoke("array_contains", _col(col), lit(value))

#array_distinct(*cols) ⇒ `Column`

The Spark SQL array_distinct function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#array_except(*cols) ⇒ `Column`

The Spark SQL array_except function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_insert(col, pos, value) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 203

def array_insert(col, pos, value) = Column.invoke("array_insert", _col(col), lit(pos), lit(value))

#array_intersect(*cols) ⇒ `Column`

The Spark SQL array_intersect function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_join(col, delimiter, null_replacement = nil) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 205

def array_join(col, delimiter, null_replacement = nil)
  if null_replacement.nil?
    Column.invoke("array_join", _col(col),
                  lit(delimiter))
  else
    Column.invoke("array_join", _col(col), lit(delimiter), lit(null_replacement))
  end
end
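
The two branches above correspond to the two forms of Spark SQL's `array_join`: without a replacement, null elements are skipped; with one, each null is substituted before joining. The sketch below models that behavior on an ordinary Ruby array. `model_array_join` is a hypothetical helper for illustration only and does not call SparkConnect.

```ruby
# Plain-Ruby model of array_join semantics on a single array value.
def model_array_join(values, delimiter, null_replacement = nil)
  kept =
    if null_replacement.nil?
      values.compact                                     # nulls are dropped
    else
      values.map { |v| v.nil? ? null_replacement : v }   # nulls are replaced
    end
  kept.join(delimiter)
end

puts model_array_join(["a", nil, "b"], ",")       # prints "a,b"
puts model_array_join(["a", nil, "b"], ",", "?")  # prints "a,?,b"
```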

#array_max(*cols) ⇒ `Column`

The Spark SQL array_max function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_min(*cols) ⇒ `Column`

The Spark SQL array_min function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_position(col, value) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 198

def array_position(col, value) = Column.invoke("array_position", _col(col), lit(value))

#array_prepend(col, value) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 202

def array_prepend(col, value) = Column.invoke("array_prepend", _col(col), lit(value))

#array_remove(col, element) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 199

def array_remove(col, element) = Column.invoke("array_remove", _col(col), lit(element))

#array_repeat(col, count) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 200

def array_repeat(col, count) = Column.invoke("array_repeat", _col(col), lit(count))
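
In the array functions with value arguments, the first argument names a column and the remaining ones are wrapped with `lit`, so they are literal values. As a rough guide to what the underlying Spark SQL functions compute per row, the sketch below models several of them on plain Ruby arrays. The `model_*` helpers are hypothetical, ignore null handling, and do not call SparkConnect.

```ruby
# Plain-Ruby models of the per-row Spark SQL semantics (null handling ignored).

def model_array_contains(arr, value)
  arr.include?(value)
end

def model_array_position(arr, value)
  # Spark positions are 1-based; 0 means the value was not found.
  (arr.index(value) || -1) + 1
end

def model_array_remove(arr, element)
  # Every occurrence equal to element is removed, not just the first.
  arr.reject { |e| e == element }
end

def model_array_prepend(arr, value)
  [value] + arr
end

def model_array_repeat(element, count)
  # The first argument is the element to repeat, not an array.
  Array.new(count, element)
end

puts model_array_position([10, 20, 30], 20)      # prints 2
puts model_array_position([10, 20, 30], 99)      # prints 0
puts model_array_remove([1, 2, 1, 3], 1).inspect # prints [2, 3]
```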

#array_sort(*cols) ⇒ `Column`

The Spark SQL array_sort function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_union(*cols) ⇒ `Column`

The Spark SQL array_union function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#arrays_overlap(*cols) ⇒ `Column`

The Spark SQL arrays_overlap function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#arrays_zip(*cols) ⇒ `Column`

The Spark SQL arrays_zip function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#asc(col) ⇒ `Column`

Returns an ascending sort order for the named/given column.

Returns:

(Column) —
an ascending sort order for the named/given column.

# File 'lib/spark_connect/functions.rb', line 42

def asc(col) = _col(col).asc

#asc_nulls_first(col) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 44

def asc_nulls_first(col) = _col(col).asc_nulls_first

#asc_nulls_last(col) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 45

def asc_nulls_last(col) = _col(col).asc_nulls_last
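
The three ascending variants differ only in where nulls land. In Spark SQL an ascending sort places nulls first by default, so `asc` and `asc_nulls_first` order rows the same way, while `asc_nulls_last` moves nulls to the end. The sketch below models this on a plain Ruby array; `model_sort_asc` is a hypothetical helper and does not call SparkConnect.

```ruby
# Plain-Ruby model of ascending sort with explicit null placement.
def model_sort_asc(values, nulls: :first)
  nils, present = values.partition(&:nil?)
  sorted = present.sort
  nulls == :first ? nils + sorted : sorted + nils
end

puts model_sort_asc([3, nil, 1]).inspect                # prints [nil, 1, 3]
puts model_sort_asc([3, nil, 1], nulls: :last).inspect  # prints [1, 3, nil]
```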

#ascii(*cols) ⇒ `Column`

The Spark SQL ascii function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#asin(*cols) ⇒ `Column`

The Spark SQL asin function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#asinh(*cols) ⇒ `Column`

The Spark SQL asinh function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#atan(*cols) ⇒ `Column`

The Spark SQL atan function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#atan2(*cols) ⇒ `Column`

The Spark SQL atan2 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#atanh(*cols) ⇒ `Column`

The Spark SQL atanh function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#avg(*cols) ⇒ `Column`

The Spark SQL avg function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#base64(*cols) ⇒ `Column`

The Spark SQL base64 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bin(*cols) ⇒ `Column`

The Spark SQL bin function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_and(*cols) ⇒ `Column`

The Spark SQL bit_and function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_count(*cols) ⇒ `Column`

The Spark SQL bit_count function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_length(*cols) ⇒ `Column`

The Spark SQL bit_length function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#bit_or(*cols) ⇒ `Column`

The Spark SQL bit_or function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#bit_xor(*cols) ⇒ `Column`

The Spark SQL bit_xor function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#bitwise_not(*cols) ⇒ `Column`

The Spark SQL bitwise_not function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#bool_and(*cols) ⇒ `Column`

The Spark SQL bool_and function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#bool_or(*cols) ⇒ `Column`

The Spark SQL bool_or function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#broadcast(df) ⇒ `DataFrame`

Marks a DataFrame for a broadcast (map-side) join.

Parameters:

df (DataFrame)

Returns:

(DataFrame)

# File 'lib/spark_connect/functions.rb', line 269

def broadcast(df) = df.hint("broadcast")

#bround(col, scale = 0) ⇒ `Column`

Rounds col to scale decimal places using HALF_EVEN ("banker's") rounding.

Returns:

(Column) —
the value rounded to scale decimal places using HALF_EVEN ("banker's") rounding.

# File 'lib/spark_connect/functions.rb', line 82

def bround(col, scale = 0) = Column.invoke("bround", _col(col), lit(scale))
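
As a local illustration of HALF_EVEN rounding (not a call into SparkConnect), Ruby's own Float#round accepts half: :even and shows how ties go to the nearest even integer. The helper name half_even_round and the sample values are hypothetical.

```ruby
# Hypothetical helper: round with ties going to the nearest even digit.
def half_even_round(x, scale = 0)
  x.round(scale, half: :even)
end

ties = [0.5, 1.5, 2.5, 3.5]

# HALF_EVEN sends each tie to the even neighbour.
puts ties.map { |x| half_even_round(x) }.inspect    # [0, 2, 2, 4]
# HALF_UP, for comparison, always rounds ties away from zero.
puts ties.map { |x| x.round(half: :up) }.inspect    # [1, 2, 3, 4]
```

Rounding ties to even avoids the upward bias that accumulates when many .5 values are summed after rounding.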

#cardinality(*cols) ⇒ `Column`

The Spark SQL cardinality function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#cbrt(*cols) ⇒ `Column`

The Spark SQL cbrt function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#ceil(*cols) ⇒ `Column`

The Spark SQL ceil function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#ceiling(*cols) ⇒ `Column`

The Spark SQL ceiling function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#char_length(*cols) ⇒ `Column`

The Spark SQL char_length function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#character_length(*cols) ⇒ `Column`

The Spark SQL character_length function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#coalesce(*cols) ⇒ `Column`

Returns the first non-null value among the given columns.

Returns:

(Column) —
the first non-null value among the given columns.

# File 'lib/spark_connect/functions.rb', line 87

def coalesce(*cols) = Column.invoke("coalesce", *cols.map { |c| _col(c) })
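
A row-level model of the coalesce semantics in plain Ruby, assuming a row is represented as a Hash. The helper coalesce_row and the sample row are hypothetical; the real function builds a Column expression that Spark evaluates.

```ruby
# Hypothetical model: return the first non-nil value among the named fields,
# or nil when every field is nil.
def coalesce_row(row, *names)
  names.map { |n| row[n] }.find { |v| !v.nil? }
end

row = { "nickname" => nil, "first_name" => "Ada", "email" => "ada@example.com" }
puts coalesce_row(row, "nickname", "first_name", "email")  # Ada
```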

#col(name) ⇒ `Column` Also known as: column

A column reference by name. "*" selects all columns.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 28

def col(name) = Column.from_name(name.to_s)

#collect_list(*cols) ⇒ `Column`

The Spark SQL collect_list function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#collect_set(*cols) ⇒ `Column`

The Spark SQL collect_set function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#concat(*cols) ⇒ `Column`

The Spark SQL concat function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#concat_ws(sep, *cols) ⇒ `Column`

Returns the concatenation of the given columns, separated by the literal string sep.

Returns:

(Column) —
the concatenation of the given columns, separated by the literal string sep.

# File 'lib/spark_connect/functions.rb', line 107

def concat_ws(sep, *cols) = Column.invoke("concat_ws", lit(sep), *cols.map { |c| _col(c) })
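
A row-level model in plain Ruby of how the separator and the columns play different roles: sep is used as a literal, while the remaining arguments name fields to look up. The helper concat_ws_row and the sample row are hypothetical. The model also skips nil values, which matches how Spark SQL's concat_ws treats null inputs.

```ruby
# Hypothetical model: join the named fields with a literal separator,
# skipping nil values.
def concat_ws_row(sep, row, *names)
  names.map { |n| row[n] }.compact.join(sep)
end

row = { "city" => "Lisbon", "region" => nil, "country" => "PT" }
puts concat_ws_row(", ", row, "city", "region", "country")  # Lisbon, PT
```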

#conv(col, from_base, to_base) ⇒ `Column`

Converts a number string from base from_base to base to_base.

Returns:

(Column) —
the number string converted from base from_base to base to_base.

# File 'lib/spark_connect/functions.rb', line 145

def conv(col, from_base, to_base) = Column.invoke("conv", _col(col), lit(from_base), lit(to_base))
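
The base conversion itself can be sketched in plain Ruby for non-negative inputs. The helper conv_value is hypothetical and only models the arithmetic; it does not reproduce Spark's handling of negative bases, overflow, invalid digits, or letter case.

```ruby
# Hypothetical model: parse the string in from_base, then render it in to_base.
# Integer(str, base) raises ArgumentError on digits that are invalid for the base.
def conv_value(str, from_base, to_base)
  Integer(str, from_base).to_s(to_base)
end

puts conv_value("100", 2, 10)  # 4
puts conv_value("ff", 16, 2)   # 11111111
```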

#corr(*cols) ⇒ `Column`

The Spark SQL corr function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#cos(*cols) ⇒ `Column`

The Spark SQL cos function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#cosh(*cols) ⇒ `Column`

The Spark SQL cosh function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#cot(*cols) ⇒ `Column`

The Spark SQL cot function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#count(col) ⇒ `Column`

Returns the number of rows, or the number of non-null values in a column. Passing "*" counts all rows.

Returns:

(Column) —
the number of rows, or the number of non-null values in a column. Passing "*" counts all rows.

# File 'lib/spark_connect/functions.rb', line 59

def count(col)
  col.to_s == "*" ? Column.invoke("count", lit(1)) : Column.invoke("count", _col(col))
end
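
The difference between counting "*" and counting a named column can be modelled in plain Ruby over an Array of Hash rows. The helper count_rows and the sample data are hypothetical; the real method only builds a Column expression.

```ruby
# Hypothetical model: "*" counts every row, a column name counts non-nil values.
def count_rows(rows, col)
  col.to_s == "*" ? rows.length : rows.count { |r| !r[col].nil? }
end

rows = [
  { "id" => 1, "email" => "a@example.com" },
  { "id" => 2, "email" => nil },
  { "id" => 3, "email" => "c@example.com" }
]

puts count_rows(rows, "*")      # 3
puts count_rows(rows, "email")  # 2
```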

#count_distinct(*cols) ⇒ `Column` Also known as: countDistinct

Returns the number of distinct combinations of values in the given columns.

Returns:

(Column) —
the number of distinct combinations of values in the given columns.

# File 'lib/spark_connect/functions.rb', line 64

def count_distinct(*cols)
  Column.invoke("count", *cols.map { |c| _col(c) }, is_distinct: true)
end
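
A plain-Ruby model of counting distinct combinations across several columns. The helper count_distinct_rows and the sample rows are hypothetical. Combinations containing a nil are dropped here because Spark SQL's count(DISTINCT ...) only counts rows whose expressions are all non-null.

```ruby
# Hypothetical model: collect the named values per row, drop combinations
# containing nil, then count the unique combinations.
def count_distinct_rows(rows, *names)
  rows.map { |r| r.values_at(*names) }
      .reject { |vals| vals.any?(&:nil?) }
      .uniq
      .length
end

rows = [
  { "country" => "PT", "city" => "Lisbon" },
  { "country" => "PT", "city" => "Lisbon" },
  { "country" => "PT", "city" => "Porto" },
  { "country" => "ES", "city" => nil }
]

puts count_distinct_rows(rows, "country", "city")  # 2
puts count_distinct_rows(rows, "country")          # 2
```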

#count_if(*cols) ⇒ `Column`

The Spark SQL count_if function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#covar_pop(*cols) ⇒ `Column`

The Spark SQL covar_pop function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#covar_samp(*cols) ⇒ `Column`

The Spark SQL covar_samp function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#crc32(*cols) ⇒ `Column`

The Spark SQL crc32 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#create_map(*cols) ⇒ `Column`

Returns a map from alternating key/value columns.

Returns:

(Column) —
a map from alternating key/value columns.




# File 'lib/spark_connect/functions.rb', line 98

def create_map(*cols) = Column.invoke("map", *cols.map { |c| _col(c) })
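"Alternating key/value columns" means the flat argument list is read in pairs: first key, first value, second key, second value, and so on. The plain-Ruby sketch below only illustrates that pairing. pair_up is a hypothetical helper, not part of the library, and its even-arity check is this sketch's own addition. Note also that create_map passes every argument through _col, so a String names a column; a literal key would need F.lit.

```ruby
# Plain-Ruby illustration only: pairs up a flat argument list the way a
# map built from (k1, v1, k2, v2, ...) reads it. No Spark classes involved.
def pair_up(*cols)
  # Hypothetical guard added for this sketch; the documented method has none.
  raise ArgumentError, "expected an even number of arguments" if cols.length.odd?

  cols.each_slice(2).to_a
end

pairs = pair_up("k1", "v1", "k2", "v2")
pairs.each { |key, value| puts "#{key} => #{value}" }
```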

#csc(*cols) ⇒ `Column`

The Spark SQL csc function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#cume_dist ⇒ `Column`

The Spark SQL cume_dist function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_catalog ⇒ `Column`

The Spark SQL current_catalog function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_database ⇒ `Column`

The Spark SQL current_database function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_date ⇒ `Column`

The Spark SQL current_date function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_schema ⇒ `Column`

The Spark SQL current_schema function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_timestamp ⇒ `Column`

The Spark SQL current_timestamp function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_timezone ⇒ `Column`

The Spark SQL current_timezone function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#current_user ⇒ `Column`

The Spark SQL current_user function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#date_add(col, days) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 156

def date_add(col, days) = Column.invoke("date_add", _col(col), lit(days))

#date_format(col, fmt) ⇒ `Object`

One of the date/time functions with literal arguments: fmt is passed as a literal format pattern, not as a column name.

# File 'lib/spark_connect/functions.rb', line 153

def date_format(col, fmt) = Column.invoke("date_format", _col(col), lit(fmt))

#date_from_unix_date(*cols) ⇒ `Column`

The Spark SQL date_from_unix_date function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#date_sub(col, days) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 157

def date_sub(col, days) = Column.invoke("date_sub", _col(col), lit(days))

#date_trunc(fmt, col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 163

def date_trunc(fmt, col) = Column.invoke("date_trunc", lit(fmt), _col(col))

#datediff(end_col, start_col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 158

def datediff(end_col, start_col) = Column.invoke("datediff", _col(end_col), _col(start_col))
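The date helpers above differ in where the literal argument sits: date_format and date_add take the column first, date_trunc takes the literal first, and datediff treats both arguments as columns. The sketch below makes those roles visible. It is an illustration under assumptions, not the library's code: DateArgSketch and its tagged-array return values are hypothetical, whereas the real methods return Column objects.

```ruby
# Hypothetical stand-ins: each method returns a tagged array so the role of
# every argument (column name vs. literal) is visible when printed.
module DateArgSketch
  module_function

  def lit(value)
    [:lit, value]
  end

  def _col(name)
    [:col, name.to_s]
  end

  # Column first, literal format pattern second.
  def date_format(col, fmt)
    ["date_format", _col(col), lit(fmt)]
  end

  # Column first, literal day count second.
  def date_add(col, days)
    ["date_add", _col(col), lit(days)]
  end

  # Literal truncation unit first, column second.
  def date_trunc(fmt, col)
    ["date_trunc", lit(fmt), _col(col)]
  end

  # Both arguments are columns: end date first, start date second.
  def datediff(end_col, start_col)
    ["datediff", _col(end_col), _col(start_col)]
  end
end

p DateArgSketch.date_format("created_at", "yyyy-MM")
p DateArgSketch.date_add("created_at", 7)
p DateArgSketch.date_trunc("month", "created_at")
p DateArgSketch.datediff("shipped_at", "ordered_at")
```

The column names used here (created_at, shipped_at, ordered_at) are made up for the example.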

#day(*cols) ⇒ `Column`

The Spark SQL day function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#dayofmonth(*cols) ⇒ `Column`

The Spark SQL dayofmonth function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#dayofweek(*cols) ⇒ `Column`

The Spark SQL dayofweek function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#dayofyear(*cols) ⇒ `Column`

The Spark SQL dayofyear function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#degrees(*cols) ⇒ `Column`

The Spark SQL degrees function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#dense_rank ⇒ `Column`

The Spark SQL dense_rank function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#desc(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 43

def desc(col) = _col(col).desc

#desc_nulls_first(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 46

def desc_nulls_first(col) = _col(col).desc_nulls_first

#desc_nulls_last(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 47

def desc_nulls_last(col) = _col(col).desc_nulls_last

#element_at(col, extraction) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 214

def element_at(col, extraction) = Column.invoke("element_at", _col(col), lit(extraction))

#every(*cols) ⇒ `Column`

The Spark SQL every function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#exists(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 246

def exists(col, &block) = Column.invoke("exists", _col(col), _lambda(block))

#exp(*cols) ⇒ `Column`

The Spark SQL exp function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#explode(*cols) ⇒ `Column`

The Spark SQL explode function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#explode_outer(*cols) ⇒ `Column`

The Spark SQL explode_outer function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#expm1(*cols) ⇒ `Column`

The Spark SQL expm1 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#expr(sql) ⇒ `Column`

Parse a SQL expression string into a Column.

Returns:

(Column)



# File 'lib/spark_connect/functions.rb', line 37

def expr(sql)
  Column.from_expr(Proto::Expression.new(expression_string: Proto::Expression::ExpressionString.new(expression: sql)))
end

#factorial(*cols) ⇒ `Column`

The Spark SQL factorial function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#filter(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 248

def filter(col, &block) = Column.invoke("filter", _col(col), _lambda(block))

#first(*cols) ⇒ `Column`

The Spark SQL first function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#first_value(*cols) ⇒ `Column`

The Spark SQL first_value function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#flatten(*cols) ⇒ `Column`

The Spark SQL flatten function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#floor(*cols) ⇒ `Column`

The Spark SQL floor function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#forall(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 247

def forall(col, &block) = Column.invoke("forall", _col(col), _lambda(block))

#format_number(col, d) ⇒ `Column`

Returns the number formatted to d decimal places.

Returns:

(Column) —
number formatted to d decimal places.



# File 'lib/spark_connect/functions.rb', line 111

def format_number(col, d) = Column.invoke("format_number", _col(col), lit(d))

#format_string(fmt, *cols) ⇒ `Column`

Returns a printf-style formatted string, using fmt as a literal format string.

Returns:

(Column) —
printf-style formatting using literal fmt.



# File 'lib/spark_connect/functions.rb', line 109

def format_string(fmt, *cols) = Column.invoke("format_string", lit(fmt), *cols.map { |c| _col(c) })
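
The argument handling of format_string can be sketched in plain Ruby without a Spark session. This is only an illustration: ColumnRef, Literal, to_col and to_lit are hypothetical stand-ins for the library's Column objects and its internal _col and lit helpers, and they model nothing beyond "String means column name" versus "value is a literal".

```ruby
# Stand-in types for illustration only (assumptions, not library classes).
ColumnRef = Struct.new(:name)   # a reference to a named column
Literal   = Struct.new(:value)  # a literal value

# Hypothetical equivalent of _col: a String denotes a column name,
# anything else is passed through unchanged.
def to_col(arg)
  arg.is_a?(String) ? ColumnRef.new(arg) : arg
end

# Hypothetical equivalent of lit: wrap the value as a literal.
def to_lit(arg)
  Literal.new(arg)
end

# Mirrors the shape of format_string(fmt, *cols): fmt becomes a literal,
# every remaining String becomes a column reference.
def format_string_args(fmt, *cols)
  [to_lit(fmt), *cols.map { |c| to_col(c) }]
end

args = format_string_args("%s is %d years old", "name", "age")
args.each { |a| puts a.inspect }
```

Here the format string is wrapped as a literal, while "name" and "age" are resolved as column names, which matches the convention described in the overview.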

#from_json(col, schema, options = {}) ⇒ `Object`

Parameters:

schema (Types::DataType, String)

# File 'lib/spark_connect/functions.rb', line 180

def from_json(col, schema, options = {})
  schema_col = schema.is_a?(Types::DataType) ? lit(schema.json) : lit(schema.to_s)
  args = [_col(col), schema_col] + options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] }
  Column.invoke("from_json", *args)
end
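
The options handling above can be tried in isolation. The sketch below reproduces only the flat_map step from the listing; flatten_json_options is a hypothetical helper name, and it returns plain strings where the real method wraps each one with lit.

```ruby
# Hypothetical helper: reproduces only the options flattening of from_json.
# Each key/value pair becomes two consecutive string arguments.
def flatten_json_options(options)
  options.flat_map { |k, v| [k.to_s, v.to_s] }
end

flat = flatten_json_options({ mode: "PERMISSIVE", allowComments: true })
p flat
```

Symbol keys and non-String values (such as true) are both converted with to_s, so the trailing arguments are always an even-length run of alternating keys and values.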

#from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 164

def from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") = Column.invoke("from_unixtime", _col(col), lit(fmt))
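
To get a feel for what the default fmt of from_unixtime produces, the following plain-Ruby sketch formats epoch seconds with the roughly equivalent strftime pattern. It is an approximation under stated assumptions: format_unix_seconds and DEFAULT_STRFTIME are hypothetical names, and the sketch always uses UTC, whereas Spark evaluates the function on the server in the session time zone.

```ruby
# Ruby strftime pattern that approximates the Spark pattern "yyyy-MM-dd HH:mm:ss".
DEFAULT_STRFTIME = "%Y-%m-%d %H:%M:%S"

# Hypothetical local helper: formats seconds since the Unix epoch in UTC.
def format_unix_seconds(seconds)
  Time.at(seconds).utc.strftime(DEFAULT_STRFTIME)
end

puts format_unix_seconds(0)       # 1970-01-01 00:00:00
puts format_unix_seconds(86_400)  # 1970-01-02 00:00:00
```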

#from_utc_timestamp(col, tz) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 170

def from_utc_timestamp(col, tz) = Column.invoke("from_utc_timestamp", _col(col), lit(tz))

#get_json_object(col, path) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 176

def get_json_object(col, path) = Column.invoke("get_json_object", _col(col), lit(path))

#greatest(*cols) ⇒ `Column`

The Spark SQL greatest function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#grouping(*cols) ⇒ `Column`

The Spark SQL grouping function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#hash(*cols) ⇒ `Column`

The Spark SQL hash function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#hex(*cols) ⇒ `Column`

The Spark SQL hex function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#hour(*cols) ⇒ `Column`

The Spark SQL hour function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#hypot(*cols) ⇒ `Column`

The Spark SQL hypot function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#initcap(*cols) ⇒ `Column`

The Spark SQL initcap function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#inline(*cols) ⇒ `Column`

The Spark SQL inline function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#inline_outer(*cols) ⇒ `Column`

The Spark SQL inline_outer function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#input_file_block_length ⇒ `Column`

The Spark SQL input_file_block_length function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#input_file_block_start ⇒ `Column`

The Spark SQL input_file_block_start function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#input_file_name ⇒ `Column`

The Spark SQL input_file_name function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#instr(col, substr) ⇒ `Column`

Returns the 1-based position of the literal substr within col, or 0 if substr is absent.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 117

def instr(col, substr) = Column.invoke("instr", _col(col), lit(substr))
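The 1-based, 0-if-absent result can be modeled in plain Ruby. This is only a sketch of the documented semantics: instr_model is a hypothetical helper that works on ordinary strings, not the library's implementation, and the nil handling assumes that SQL NULL inputs produce a NULL result.

```ruby
# Plain-Ruby model of the documented instr result (hypothetical helper).
def instr_model(str, substr)
  return nil if str.nil? || substr.nil?  # assumption: NULL in, NULL out

  idx = str.index(substr)                # 0-based index, or nil when absent
  idx ? idx + 1 : 0                      # shift to 1-based; 0 means "not found"
end

puts instr_model("spark connect", "connect")  # 7
puts instr_model("spark connect", "flink")    # 0
```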

#isnan(*cols) ⇒ `Column`

The Spark SQL isnan function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#isnull(*cols) ⇒ `Column`

The Spark SQL isnull function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#json_tuple(col, *fields) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 177

def json_tuple(col, *fields) = Column.invoke("json_tuple", _col(col), *fields.map { |f| lit(f) })
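Each field name is passed as a literal, and the function yields one value per requested field. The sketch below models that behavior on a single JSON string using Ruby's standard json library. json_tuple_model is a hypothetical helper, and the details it encodes (top-level keys only, non-string values rendered as JSON text, nil for missing keys or unparsable input) are assumptions based on Spark SQL's json_tuple rather than on this library's source.

```ruby
require "json"

# Plain-Ruby model of json_tuple for one JSON document (hypothetical helper).
def json_tuple_model(json, *fields)
  return fields.map { nil } if json.nil?

  doc = JSON.parse(json)
  return fields.map { nil } unless doc.is_a?(Hash)

  fields.map do |field|
    value = doc[field]
    # Strings come back as-is; other values are rendered as JSON text.
    value.nil? || value.is_a?(String) ? value : JSON.generate(value)
  end
rescue JSON::ParserError
  fields.map { nil }
end

p json_tuple_model('{"a": "x", "b": 2}', "a", "b", "c")  # ["x", "2", nil]
```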

#kurtosis(*cols) ⇒ `Column`

The Spark SQL kurtosis function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#lag(col, offset = 1, default = nil) ⇒ `Object`

---- Window / analytic functions --------------------------------------

# File 'lib/spark_connect/functions.rb', line 225

def lag(col, offset = 1, default = nil) = Column.invoke("lag", _col(col), lit(offset), lit(default))

#last(*cols) ⇒ `Column`

The Spark SQL last function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#last_day(*cols) ⇒ `Column`

The Spark SQL last_day function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#last_value(*cols) ⇒ `Column`

The Spark SQL last_value function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#lcase(*cols) ⇒ `Column`

The Spark SQL lcase function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#lead(col, offset = 1, default = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 226

def lead(col, offset = 1, default = nil) = Column.invoke("lead", _col(col), lit(offset), lit(default))
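lag and lead share the same signature: a column, a row offset (default 1), and a default used when the offset falls outside the window partition. The following sketch models that behavior on a plain Ruby array standing in for one already-ordered partition. The helper names are hypothetical; in Spark the evaluation happens on the server over a window specification.

```ruby
# Shift values by a signed offset, filling out-of-range positions with default.
def shift_model(values, offset, default = nil)
  values.each_index.map do |i|
    j = i + offset
    j.between?(0, values.size - 1) ? values[j] : default
  end
end

# lag looks backwards in the partition, lead looks forwards.
def lag_model(values, offset = 1, default = nil)
  shift_model(values, -offset, default)
end

def lead_model(values, offset = 1, default = nil)
  shift_model(values, offset, default)
end

p lag_model([10, 20, 30])         # [nil, 10, 20]
p lead_model([10, 20, 30], 1, 0)  # [20, 30, 0]
```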

#least(*cols) ⇒ `Column`

The Spark SQL least function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#length(*cols) ⇒ `Column`

The Spark SQL length function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#lit(value) ⇒ `Column`

A literal value column. See Column.lit for supported Ruby types.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 33

def lit(value) = Column.lit(value)

#ln(*cols) ⇒ `Column`

The Spark SQL ln function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#locate(substr, col, pos = 1) ⇒ `Column`

Returns the 1-based position of the first occurrence of the literal substr in col at or after position pos, or 0 if it is not found. Note that the argument order differs from instr: substr comes first.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 119

def locate(substr, col, pos = 1) = Column.invoke("locate", lit(substr), _col(col), lit(pos))
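The effect of pos is easiest to see on a concrete string. The sketch below is a plain-Ruby model with a hypothetical helper name, locate_model. It assumes that a pos below 1 yields 0 and that NULL inputs yield NULL, following Spark SQL's locate; it does not call the library.

```ruby
# Plain-Ruby model of locate on one string (hypothetical helper).
def locate_model(substr, str, pos = 1)
  return nil if substr.nil? || str.nil?  # assumption: NULL in, NULL out
  return 0 if pos < 1                    # assumption: invalid start yields 0

  idx = str.index(substr, pos - 1)       # Ruby's index takes a 0-based start
  idx ? idx + 1 : 0
end

puts locate_model("a", "banana")     # 2
puts locate_model("a", "banana", 3)  # 4
puts locate_model("z", "banana")     # 0
```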

#log(*cols) ⇒ `Column`

The Spark SQL log function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#log10(*cols) ⇒ `Column`

The Spark SQL log10 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#log1p(*cols) ⇒ `Column`

The Spark SQL log1p function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#log2(*cols) ⇒ `Column`

The Spark SQL log2 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#lower(*cols) ⇒ `Column`

The Spark SQL lower function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#lpad(col, len, pad) ⇒ `Column`

Returns col left-padded with the literal pad to a total length of len. If col is already longer than len, the result is shortened to len characters.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 121

def lpad(col, len, pad) = Column.invoke("lpad", _col(col), lit(len), lit(pad))
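A point that is easy to miss with lpad is that len is a target length, not a number of pad characters, so longer inputs are cut off. The plain-Ruby model below illustrates this under the assumption that the library simply forwards to Spark SQL's lpad. lpad_model is a hypothetical helper, and its handling of an empty pad and a non-positive len is an assumption.

```ruby
# Plain-Ruby model of lpad on one string (hypothetical helper).
def lpad_model(str, len, pad)
  return "" if len <= 0                                    # assumption
  return str[0, len] if str.length >= len || pad.empty?    # truncate, or nothing to pad with

  str.rjust(len, pad)                                      # repeat pad on the left up to len
end

puts lpad_model("7", 3, "0")      # 007
puts lpad_model("hi", 5, "ab")    # abahi
puts lpad_model("spark", 3, "0")  # spa
```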

#ltrim(*cols) ⇒ `Column`

The Spark SQL ltrim function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#make_date(year, month, day) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 172

def make_date(year, month, day) = Column.invoke("make_date", _col(year), _col(month), _col(day))

#map_concat(*cols) ⇒ `Column`

The Spark SQL map_concat function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#map_contains_key(col, key) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 221

def map_contains_key(col, key) = Column.invoke("map_contains_key", _col(col), lit(key))
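
The definition passes col through _col and key through lit, so a String in the first position names a column while a String in the second is a literal key. The sketch below makes that visible without a Spark server. DemoColumn and DemoF are hypothetical stand-ins, and the behaviour given to _col and lit is an assumption based on the Overview, not the library's code.

```ruby
# Hypothetical stand-in for SparkConnect::Column so this runs without a server.
DemoColumn = Struct.new(:fn, :args) do
  def self.invoke(name, *args)
    new(name, args)
  end

  # Render nested calls, e.g. f(col("a"), lit(1)).
  def to_s
    rendered = args.map { |a| a.is_a?(DemoColumn) ? a.to_s : a.inspect }
    "#{fn}(#{rendered.join(', ')})"
  end
end

module DemoF
  # Assumed behaviour: a String names a column, a column passes through.
  def _col(value)
    value.is_a?(DemoColumn) ? value : DemoColumn.invoke("col", value)
  end

  # Assumed behaviour: any plain Ruby value becomes a literal.
  def lit(value)
    value.is_a?(DemoColumn) ? value : DemoColumn.invoke("lit", value)
  end

  # Same shape as the documented definition.
  def map_contains_key(col, key)
    DemoColumn.invoke("map_contains_key", _col(col), lit(key))
  end

  module_function :_col, :lit, :map_contains_key
end

expr = DemoF.map_contains_key("attributes", "color")
puts expr.to_s # prints map_contains_key(col("attributes"), lit("color"))
```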

#map_entries(*cols) ⇒ `Column`

The Spark SQL map_entries function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_filter(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 252

def map_filter(col, &block) = Column.invoke("map_filter", _col(col), _lambda(block))

#map_from_arrays(keys, values) ⇒ `Column`

Returns a map from two array columns (keys, values).

Returns:

(Column) —
a map from two array columns (keys, values).



# File 'lib/spark_connect/functions.rb', line 100

def map_from_arrays(keys, values) = Column.invoke("map_from_arrays", _col(keys), _col(values))

#map_from_entries(*cols) ⇒ `Column`

The Spark SQL map_from_entries function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_keys(*cols) ⇒ `Column`

The Spark SQL map_keys function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_values(*cols) ⇒ `Column`

The Spark SQL map_values function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_zip_with(c1, c2, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 253

def map_zip_with(c1, c2, &block) = Column.invoke("map_zip_with", _col(c1), _col(c2), _lambda(block))

#max(*cols) ⇒ `Column`

The Spark SQL max function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#max_by(*cols) ⇒ `Column`

The Spark SQL max_by function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#md5(*cols) ⇒ `Column`

The Spark SQL md5 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#mean(*cols) ⇒ `Column`

The Spark SQL mean function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#median(*cols) ⇒ `Column`

The Spark SQL median function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#min(*cols) ⇒ `Column`

The Spark SQL min function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#min_by(*cols) ⇒ `Column`

The Spark SQL min_by function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#minute(*cols) ⇒ `Column`

The Spark SQL minute function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#mode(*cols) ⇒ `Column`

The Spark SQL mode function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#monotonically_increasing_id ⇒ `Column`

The Spark SQL monotonically_increasing_id function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#month(*cols) ⇒ `Column`

The Spark SQL month function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#months_between(d1, d2, round_off = true) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 160

def months_between(d1, d2, round_off = true) = Column.invoke("months_between", _col(d1), _col(d2), lit(round_off))
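
In this definition d1 and d2 go through _col, while round_off is always sent as a literal and defaults to true. The sketch below shows the resulting argument list for the default and for an explicit false. StubColumn and StubF are hypothetical stand-ins, and their _col and lit behaviour is assumed rather than taken from the library.

```ruby
# Hypothetical stand-in for SparkConnect::Column so this runs without a server.
StubColumn = Struct.new(:fn, :args) do
  def self.invoke(name, *args)
    new(name, args)
  end

  def to_s
    rendered = args.map { |a| a.is_a?(StubColumn) ? a.to_s : a.inspect }
    "#{fn}(#{rendered.join(', ')})"
  end
end

module StubF
  # Assumed behaviour: a String names a column, a column passes through.
  def _col(value)
    value.is_a?(StubColumn) ? value : StubColumn.invoke("col", value)
  end

  # Assumed behaviour: any plain Ruby value becomes a literal.
  def lit(value)
    value.is_a?(StubColumn) ? value : StubColumn.invoke("lit", value)
  end

  # Same shape as the documented definition.
  def months_between(d1, d2, round_off = true)
    StubColumn.invoke("months_between", _col(d1), _col(d2), lit(round_off))
  end

  module_function :_col, :lit, :months_between
end

rounded   = StubF.months_between("end_date", "start_date")
unrounded = StubF.months_between("end_date", "start_date", false)
puts rounded.to_s   # prints months_between(col("end_date"), col("start_date"), lit(true))
puts unrounded.to_s # prints months_between(col("end_date"), col("start_date"), lit(false))
```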

#named_struct(*cols) ⇒ `Column`

Returns a named struct from alternating name/value arguments.

Returns:

(Column) —
a named struct from alternating name/value arguments.

# File 'lib/spark_connect/functions.rb', line 102

def named_struct(*cols) = Column.invoke("named_struct", *cols.map { |c| _col(c) })
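
Every argument of named_struct goes through _col, so a bare String is read as a column name. Spark SQL expects the field names of named_struct to be constant strings, so under that convention the names would presumably be wrapped in lit. The sketch below shows the resulting call shape. FakeColumn and FakeF are hypothetical stand-ins with assumed _col and lit behaviour, not the library's code.

```ruby
# Hypothetical stand-in for SparkConnect::Column so this runs without a server.
FakeColumn = Struct.new(:fn, :args) do
  def self.invoke(name, *args)
    new(name, args)
  end

  def to_s
    rendered = args.map { |a| a.is_a?(FakeColumn) ? a.to_s : a.inspect }
    "#{fn}(#{rendered.join(', ')})"
  end
end

module FakeF
  # Assumed behaviour: a String names a column, a column passes through.
  def _col(value)
    value.is_a?(FakeColumn) ? value : FakeColumn.invoke("col", value)
  end

  # Assumed behaviour: any plain Ruby value becomes a literal.
  def lit(value)
    value.is_a?(FakeColumn) ? value : FakeColumn.invoke("lit", value)
  end

  # Alternating name/value arguments, each mapped through _col.
  def named_struct(*cols)
    FakeColumn.invoke("named_struct", *cols.map { |c| _col(c) })
  end

  module_function :_col, :lit, :named_struct
end

# Field names are wrapped in lit; the values are column names.
person = FakeF.named_struct(FakeF.lit("id"), "user_id", FakeF.lit("name"), "user_name")
puts person.to_s
# prints named_struct(lit("id"), col("user_id"), lit("name"), col("user_name"))
```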

#nanvl(col1, col2) ⇒ `Column`

Returns col1 if it is not NaN, else col2.

Returns:

(Column) —
col1 if it is not NaN, else col2.

# File 'lib/spark_connect/functions.rb', line 89

def nanvl(col1, col2) = Column.invoke("nanvl", _col(col1), _col(col2))

#negate(*cols) ⇒ `Column`

The Spark SQL negate function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#negative(*cols) ⇒ `Column`

The Spark SQL negative function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#next_day(col, day_of_week) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 161

def next_day(col, day_of_week) = Column.invoke("next_day", _col(col), lit(day_of_week))

#now ⇒ `Column`

The Spark SQL now function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#nth_value(col, offset, ignore_nulls = false) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 228

def nth_value(col, offset, ignore_nulls = false) = Column.invoke("nth_value", _col(col), lit(offset), lit(ignore_nulls))

#ntile(n) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 227

def ntile(n) = Column.invoke("ntile", lit(n))

#octet_length(*cols) ⇒ `Column`

The Spark SQL octet_length function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#overlay(col, replace, pos, len = -1) ⇒ `Column`

Returns col with replace overlaid starting at position pos, replacing len characters.

Returns:

(Column) —
col with replace overlaid starting at position pos, replacing len characters.

# File 'lib/spark_connect/functions.rb', line 141

def overlay(col, replace, pos, len = -1) = Column.invoke("overlay", _col(col), _col(replace), lit(pos), lit(len))
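
To make the pos and len semantics concrete, here is a plain-Ruby emulation of what Spark's overlay is documented to do: positions are 1-based, and a negative len means "use the length of the replacement". `overlay_local` is a hypothetical helper for illustration only; it works on Ruby strings and does not call Spark, whose server-side result remains authoritative.

```ruby
# Local emulation of overlay on plain strings (assumed semantics:
# 1-based pos; negative len defaults to the replacement's length).
def overlay_local(src, replace, pos, len = -1)
  len = replace.length if len.negative?
  head = src[0, pos - 1] || ""
  tail = src[(pos - 1 + len)..] || ""   # nil when the span runs past the end
  head + replace + tail
end

puts overlay_local("Spark SQL", "_", 6)              # Spark_SQL
puts overlay_local("Spark SQL", "CORE", 7)           # Spark CORE
puts overlay_local("Spark SQL", "ANSI ", 7, 0)       # Spark ANSI SQL
puts overlay_local("Spark SQL", "tructured", 2, 4)   # Structured SQL
```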

#percent_rank ⇒ `Column`

The Spark SQL percent_rank function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#pmod(*cols) ⇒ `Column`

The Spark SQL pmod function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822
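
As a plain-Ruby illustration of what pmod computes on the server: take the truncated remainder and, if it is negative, shift it by the divisor. `pmod_local` is a hypothetical helper that mirrors the documented Spark behavior for integers; it does not call Spark.

```ruby
# Positive modulus on plain integers, emulating Spark's pmod.
# Integer#remainder truncates toward zero, like the JVM's % operator.
def pmod_local(dividend, divisor)
  r = dividend.remainder(divisor)
  r.negative? ? (r + divisor).remainder(divisor) : r
end

puts pmod_local(10, 3)    # 1
puts pmod_local(-10, 3)   # 2
```

With the real module the equivalent call would take two column names, for example F.pmod("a", "b"), since String arguments of generated functions denote columns.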

#posexplode(*cols) ⇒ `Column`

The Spark SQL posexplode function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#posexplode_outer(*cols) ⇒ `Column`

The Spark SQL posexplode_outer function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#positive(*cols) ⇒ `Column`

The Spark SQL positive function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#pow(*cols) ⇒ `Column`

The Spark SQL pow function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#power(*cols) ⇒ `Column`

The Spark SQL power function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#product(*cols) ⇒ `Column`

The Spark SQL product function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#quarter(*cols) ⇒ `Column`

The Spark SQL quarter function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#radians(*cols) ⇒ `Column`

The Spark SQL radians function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#rand(seed = nil) ⇒ `Column`

Randomness function. When seed is given, it is passed to Spark as a literal.

# File 'lib/spark_connect/functions.rb', line 236

def rand(seed = nil) = seed.nil? ? Column.invoke("rand") : Column.invoke("rand", lit(seed))

#randn(seed = nil) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 237

def randn(seed = nil) = seed.nil? ? Column.invoke("randn") : Column.invoke("randn", lit(seed))

#rank ⇒ `Column`

The Spark SQL rank function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#regexp_count(col, pattern) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 138

def regexp_count(col, pattern) = Column.invoke("regexp_count", _col(col), lit(pattern))

#regexp_extract(col, pattern, idx = 0) ⇒ `Column`

Returns the idx-th group of pattern matched in col.

Returns:

(Column) —
the idx-th group of pattern matched in col.

# File 'lib/spark_connect/functions.rb', line 131

def regexp_extract(col, pattern, idx = 0) = Column.invoke("regexp_extract", _col(col), lit(pattern), lit(idx))

#regexp_extract_all(col, pattern, idx = 1) ⇒ `Column`

Returns all matches of group idx of pattern.

Returns:

(Column) —
all matches of group idx of pattern.

# File 'lib/spark_connect/functions.rb', line 133

def regexp_extract_all(col, pattern, idx = 1) = Column.invoke("regexp_extract_all", _col(col), lit(pattern), lit(idx))

#regexp_like(col, pattern) ⇒ `Column`

Returns whether col matches pattern.

Returns:

(Column) —
whether col matches pattern.

# File 'lib/spark_connect/functions.rb', line 137

def regexp_like(col, pattern) = Column.invoke("regexp_like", _col(col), lit(pattern))

#regexp_replace(col, pattern, replacement) ⇒ `Column`

Returns col with pattern replaced by replacement.

Returns:

(Column) —
col with pattern replaced by replacement.

# File 'lib/spark_connect/functions.rb', line 135

def regexp_replace(col, pattern, replacement) = Column.invoke("regexp_replace", _col(col), lit(pattern), lit(replacement))
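
The definition above shows the split between column-name and literal arguments: the first argument goes through `_col`, while pattern and replacement go through `lit`. The sketch below makes that visible with stand-in types. `LiteralArgSketch`, `Expr`, `Call`, and the behavior of `_col` are assumptions for illustration, not the library's real classes.

```ruby
module LiteralArgSketch
  # Hypothetical stand-ins for the expression nodes the real library builds.
  Expr = Struct.new(:kind, :value)
  Call = Struct.new(:fn, :args)

  module_function

  def col(name)
    Expr.new(:column, name)
  end

  def lit(value)
    Expr.new(:literal, value)
  end

  # Assumed behavior: a String names a column; other values pass through.
  def _col(value)
    value.is_a?(String) ? col(value) : value
  end

  # Same argument handling as the definition in the listing.
  def regexp_replace(col, pattern, replacement)
    Call.new("regexp_replace", [_col(col), lit(pattern), lit(replacement)])
  end
end

call = LiteralArgSketch.regexp_replace("phone", "[^0-9]", "")
p call.args.map(&:kind)   # [:column, :literal, :literal]
```

Here "phone" is read as a column name, whereas "[^0-9]" and "" are sent as literal values, which matches the convention stated in the module overview.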

#regexp_substr(col, pattern) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 139

def regexp_substr(col, pattern) = Column.invoke("regexp_substr", _col(col), lit(pattern))

#repeat(col, n) ⇒ `Column`

Returns the string repeated n times.

Returns:

(Column) —
the string repeated n times.

# File 'lib/spark_connect/functions.rb', line 125

def repeat(col, n) = Column.invoke("repeat", _col(col), lit(n))

#reverse(*cols) ⇒ `Column`

The Spark SQL reverse function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#rint(*cols) ⇒ `Column`

The Spark SQL rint function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#round(col, scale = 0) ⇒ `Column`

Returns col rounded to scale decimal places using HALF_UP rounding.

Returns:

(Column) —
col rounded to scale decimal places using HALF_UP rounding.

# File 'lib/spark_connect/functions.rb', line 80

def round(col, scale = 0) = Column.invoke("round", _col(col), lit(scale))
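
HALF_UP matters only for ties. The following plain-Ruby sketch contrasts it with HALF_EVEN ("banker's") rounding using Ruby's own `round(half:)` option; it runs locally and does not call Spark. `half_up` and `half_even` are hypothetical helper names.

```ruby
# Ties are rounded away from zero under HALF_UP
# and to the nearest even digit under HALF_EVEN.
def half_up(value, scale = 0)
  value.round(scale, half: :up)
end

def half_even(value, scale = 0)
  value.round(scale, half: :even)
end

puts half_up(2.5)        # 3
puts half_even(2.5)      # 2
puts half_up(-2.5)       # -3
puts half_up(25, -1)     # 30
puts half_even(25, -1)   # 20
```

A negative scale rounds to the left of the decimal point, as in the last two lines.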

#row_number ⇒ `Column`

The Spark SQL row_number function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#rpad(col, len, pad) ⇒ `Column`

Returns col right-padded with pad to a total length of len.

Returns:

(Column) —
col right-padded with pad to a total length of len.

# File 'lib/spark_connect/functions.rb', line 123

def rpad(col, len, pad) = Column.invoke("rpad", _col(col), lit(len), lit(pad))
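
For intuition about len and pad, here is a plain-Ruby emulation of Spark's documented rpad behavior: the input is padded on the right with pad repeated as needed, and truncated when it is already longer than len. `rpad_local` is a hypothetical helper; it does not call Spark and assumes a non-empty pad.

```ruby
# Local emulation of rpad on plain strings (assumes pad is non-empty).
def rpad_local(str, len, pad)
  return str[0, len] if str.length >= len   # longer input is truncated
  str.ljust(len, pad)                       # pad is repeated and cut to fit
end

puts rpad_local("hi", 5, "??")   # hi???
puts rpad_local("hi", 1, "??")   # h
```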

#rtrim(*cols) ⇒ `Column`

The Spark SQL rtrim function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#schema_of_json(json, options = {}) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 191

def schema_of_json(json, options = {})
  Column.invoke("schema_of_json", _lit_or_col(json), *options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] })
end
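
The options hash is flattened into alternating key and value strings before being passed on. The snippet below reproduces only that flattening step in plain Ruby, without the `lit` wrapping; `flatten_options` is a hypothetical name, and the option keys are sample values chosen for illustration.

```ruby
# Mirrors the flat_map step in schema_of_json, minus the lit(...) wrapping.
def flatten_options(options)
  options.flat_map { |key, value| [key.to_s, value.to_s] }
end

p flatten_options({ allowComments: true, primitivesAsString: false })
# ["allowComments", "true", "primitivesAsString", "false"]
```

Both keys and values are stringified with `to_s`, so symbols and booleans may be used freely in the options hash.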

#sec(*cols) ⇒ `Column`

The Spark SQL sec function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#second(*cols) ⇒ `Column`

The Spark SQL second function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sequence(start, stop, step = nil) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 217

def sequence(start, stop, step = nil)
  step.nil? ? Column.invoke("sequence", _col(start), _col(stop)) : Column.invoke("sequence", _col(start), _col(stop), _col(step))
end
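
When step is omitted, only start and stop are sent, and Spark picks the default step: 1 if start is less than or equal to stop, otherwise -1. The plain-Ruby sketch below emulates that behavior for integers; `sequence_local` is a hypothetical helper and does not call Spark.

```ruby
# Local emulation of sequence for integers; both ends are inclusive.
def sequence_local(start, stop, step = nil)
  step ||= start <= stop ? 1 : -1   # Spark's documented default
  start.step(stop, step).to_a
end

p sequence_local(1, 5)      # [1, 2, 3, 4, 5]
p sequence_local(5, 1)      # [5, 4, 3, 2, 1]
p sequence_local(1, 9, 4)   # [1, 5, 9]
```

Note that in the real function all three arguments go through `_col`, so a String passed as start, stop, or step is read as a column name.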

#sha(*cols) ⇒ `Column`

The Spark SQL sha function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sha1(*cols) ⇒ `Column`

The Spark SQL sha1 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sha2(col, num_bits) ⇒ `Column`

Returns the SHA-2 hash of col with the given bit length (224, 256, 384, or 512).

Returns:

(Column) —
the SHA-2 hash of col with the given bit length (224, 256, 384, or 512).




# File 'lib/spark_connect/functions.rb', line 143

def sha2(col, num_bits) = Column.invoke("sha2", _col(col), lit(num_bits))
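To make the `num_bits` argument concrete, the sketch below computes the same family of digests locally with Ruby's OpenSSL standard library. It hashes a plain Ruby String rather than a Spark column, and `sha2_hex` is a hypothetical helper, so treat it as an illustration of the bit lengths only.

```ruby
require "openssl"

# Local illustration only: hashes a Ruby String, not a Spark column.
def sha2_hex(text, num_bits)
  unless [224, 256, 384, 512].include?(num_bits)
    raise ArgumentError, "unsupported bit length: #{num_bits}"
  end

  OpenSSL::Digest.new("SHA#{num_bits}").hexdigest(text)
end

# Each hex digest has num_bits / 4 characters.
digests = [224, 256, 384, 512].to_h { |bits| [bits, sha2_hex("abc", bits)] }
```

In the wrapper itself, `num_bits` is wrapped with `lit`, so it is always sent as a literal value and never interpreted as a column name.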

#shiftleft(col, num_bits) ⇒ `Column`

Returns col shifted left by a literal number of bits.

Returns:

(Column) —
col shifted left by a literal number of bits.

# File 'lib/spark_connect/functions.rb', line 147

def shiftleft(col, num_bits) = Column.invoke("shiftleft", _col(col), lit(num_bits))

#shiftright(col, num_bits) ⇒ `Column`

Returns col shifted right (signed) by a literal number of bits.

# File 'lib/spark_connect/functions.rb', line 148

def shiftright(col, num_bits) = Column.invoke("shiftright", _col(col), lit(num_bits))

#shiftrightunsigned(col, num_bits) ⇒ `Column`

Returns col shifted right (unsigned) by a literal number of bits.

# File 'lib/spark_connect/functions.rb', line 149

def shiftrightunsigned(col, num_bits) = Column.invoke("shiftrightunsigned", _col(col), lit(num_bits))
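The signed and unsigned right shifts differ only for negative inputs. The plain-Ruby model below shows the difference on 32-bit signed integers, assuming an integer column and a shift count between 0 and 31. All helper names are hypothetical and nothing here calls Spark; for a 64-bit column the same idea applies with wider masks.

```ruby
# Plain-Ruby model of the three shifts on a 32-bit signed integer, valid for
# 0 <= bits < 32. These helpers are hypothetical and do not call Spark.
def to_int32(n)
  n &= 0xFFFF_FFFF
  n >= 0x8000_0000 ? n - 0x1_0000_0000 : n
end

def shiftleft32(n, bits)
  to_int32(n << bits)
end

def shiftright32(n, bits)
  to_int32(n) >> bits # the sign bit is copied in from the left
end

def shiftrightunsigned32(n, bits)
  to_int32((n & 0xFFFF_FFFF) >> bits) # zeros are shifted in from the left
end

signed   = shiftright32(-8, 1)
unsigned = shiftrightunsigned32(-8, 1)
```

For non-negative inputs the two right shifts agree; for `-8` the signed shift stays negative while the unsigned shift yields a large positive value.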

#shuffle(*cols) ⇒ `Column`

The Spark SQL shuffle function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#signum(*cols) ⇒ `Column`

The Spark SQL signum function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sin(*cols) ⇒ `Column`

The Spark SQL sin function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sinh(*cols) ⇒ `Column`

The Spark SQL sinh function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#size(*cols) ⇒ `Column`

The Spark SQL size function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#skewness(*cols) ⇒ `Column`

The Spark SQL skewness function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#slice(col, start, length) ⇒ `Column`

# File 'lib/spark_connect/functions.rb', line 215

def slice(col, start, length) = Column.invoke("slice", _col(col), _lit_or_col(start), _lit_or_col(length))
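Spark SQL's `slice` uses 1-based indexing, and a negative start counts from the end of the array. The plain-Ruby model below illustrates that indexing on an ordinary Array; `slice_model` is a hypothetical helper and does not call Spark. In the wrapper itself, `start` and `length` go through `_lit_or_col`, so each may be a literal or a Column.

```ruby
# Plain-Ruby model of Spark SQL's slice on an Array (hypothetical helper).
def slice_model(array, start, length)
  raise ArgumentError, "start must not be 0" if start.zero?
  raise ArgumentError, "length must not be negative" if length.negative?

  # Convert the 1-based start to Ruby's 0-based index; negative starts
  # already count from the end in Ruby.
  index = start.positive? ? start - 1 : start
  array[index, length] || []
end

from_front = slice_model([1, 2, 3, 4], 2, 2)
from_back  = slice_model([1, 2, 3, 4], -2, 2)
```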

#some(*cols) ⇒ `Column`

The Spark SQL some function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sort_array(col, asc = true) ⇒ `Column`

Sorts the array column col in ascending order by default; pass asc = false for descending order.

# File 'lib/spark_connect/functions.rb', line 232

def sort_array(col, asc = true) = Column.invoke("sort_array", _col(col), lit(asc))

#soundex(*cols) ⇒ `Column`

The Spark SQL soundex function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#spark_partition_id ⇒ `Column`

The Spark SQL spark_partition_id function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#split(col, pattern, limit = -1) ⇒ `Column`

Returns col split around matches of the literal regex pattern.

Returns:

(Column) —
col split around matches of the literal regex pattern.




# File 'lib/spark_connect/functions.rb', line 127

def split(col, pattern, limit = -1) = Column.invoke("split", _col(col), lit(pattern), lit(limit))
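Both `pattern` and `limit` are wrapped with `lit`, so the pattern String is a regex literal, not a column name. The effect of `limit` can be illustrated with a plain-Ruby model: a positive limit caps the number of parts, while a zero or negative limit splits as many times as possible. `split_model` is a hypothetical helper that uses Ruby's regex engine, which differs from Spark's Java regex dialect in some details, so this is a sketch of the `limit` behavior only.

```ruby
# Plain-Ruby model of split's limit argument (hypothetical helper).
def split_model(text, pattern, limit = -1)
  # Treat 0 like a negative limit so trailing empty strings are kept.
  limit = -1 if limit <= 0
  text.split(Regexp.new(pattern), limit)
end

all_parts = split_model("a1b2c3d", "[0-9]")
two_parts = split_model("a1b2c3d", "[0-9]", 2)
```

With a limit of 2, the second part contains everything after the first match.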

#sqrt(*cols) ⇒ `Column`

The Spark SQL sqrt function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#stddev(*cols) ⇒ `Column`

The Spark SQL stddev function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#stddev_pop(*cols) ⇒ `Column`

The Spark SQL stddev_pop function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#stddev_samp(*cols) ⇒ `Column`

The Spark SQL stddev_samp function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#struct(*cols) ⇒ `Column`

Returns a struct from the given columns.

Returns:

(Column) —
a struct from the given columns.




# File 'lib/spark_connect/functions.rb', line 94

def struct(*cols) = Column.invoke("struct", *cols.map { |c| _col(c) })

#substring(col, pos, len) ⇒ `Column`

Returns substring of length len from 1-based pos.

Returns:

(Column) —
substring of length len from 1-based pos.




# File 'lib/spark_connect/functions.rb', line 113

def substring(col, pos, len) = Column.invoke("substring", _col(col), lit(pos), lit(len))
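Because `pos` is 1-based, it is offset by one from Ruby's own String indexing. The plain-Ruby model below illustrates the mapping for positions that fall inside the string; `substring_model` is a hypothetical helper and does not call Spark.

```ruby
# Plain-Ruby model of substring's 1-based position (hypothetical helper).
def substring_model(text, pos, len)
  # Positive positions are 1-based; negative positions count from the end.
  start = pos.positive? ? pos - 1 : pos
  text[start, len] || ""
end

single = substring_model("Spark SQL", 5, 1)
prefix = substring_model("Spark SQL", 1, 5)
suffix = substring_model("Spark SQL", -3, 3)
```

In the wrapper, `pos` and `len` are wrapped with `lit`, so they are always literal values.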

#substring_index(col, delim, count) ⇒ `Column`

Returns substring before the count-th occurrence of delim.

Returns:

(Column) —
substring before the count-th occurrence of delim.




# File 'lib/spark_connect/functions.rb', line 115

def substring_index(col, delim, count) = Column.invoke("substring_index", _col(col), lit(delim), lit(count))
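The counting rule is easiest to see on a concrete string. The plain-Ruby model below returns everything before the count-th delimiter for a positive count. It also models Spark SQL's handling of a negative count, which counts delimiters from the right and returns everything after that one. `substring_index_model` is a hypothetical helper and does not call Spark.

```ruby
# Plain-Ruby model of substring_index (hypothetical helper).
def substring_index_model(text, delim, count)
  return "" if count.zero? || delim.empty?

  # Escape the delimiter so it is matched literally; -1 keeps empty parts.
  parts = text.split(Regexp.new(Regexp.escape(delim)), -1)
  kept = count.positive? ? parts.first(count) : parts.last(-count)
  kept.join(delim)
end

left  = substring_index_model("www.apache.org", ".", 2)
right = substring_index_model("www.apache.org", ".", -2)
```

When `count` exceeds the number of delimiters, the whole string is returned unchanged.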

#sum(*cols) ⇒ `Column`

The Spark SQL sum function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sum_distinct(col) ⇒ `Column`

Returns sum of distinct values.

Returns:

(Column) —
sum of distinct values.

# File 'lib/spark_connect/functions.rb', line 75

def sum_distinct(col) = Column.invoke("sum", _col(col), is_distinct: true)

#tan(*cols) ⇒ `Column`

The Spark SQL tan function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#tanh(*cols) ⇒ `Column`

The Spark SQL tanh function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#timestamp_micros(*cols) ⇒ `Column`

The Spark SQL timestamp_micros function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#timestamp_millis(*cols) ⇒ `Column`

The Spark SQL timestamp_millis function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#timestamp_seconds(*cols) ⇒ `Column`

The Spark SQL timestamp_seconds function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#to_date(col, fmt = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 154

def to_date(col, fmt = nil) = fmt ? Column.invoke("to_date", _col(col), lit(fmt)) : Column.invoke("to_date", _col(col))

#to_json(col, options = {}) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 186

def to_json(col, options = {})
  args = [_col(col)] + options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] }
  Column.invoke("to_json", *args)
end
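
The options hash is flattened into alternating key and value literals before being passed to the server. A minimal, self-contained sketch of just that flattening step follows. The `flatten_options` helper name is hypothetical, plain strings stand in for the `lit(...)` Columns the real method builds, and the option names are only illustrative.

```ruby
# Hypothetical helper mirroring the flat_map step shown in to_json.
# Plain strings replace the lit(...) Columns used by the real method.
def flatten_options(options = {})
  options.flat_map { |k, v| [k.to_s, v.to_s] }
end

flat = flatten_options(timestampFormat: "dd/MM/yyyy", pretty: true)
p flat # => ["timestampFormat", "dd/MM/yyyy", "pretty", "true"]
```

Because both keys and values go through `to_s`, symbols and booleans in the hash reach the server as strings.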

#to_timestamp(col, fmt = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 155

def to_timestamp(col, fmt = nil) = fmt ? Column.invoke("to_timestamp", _col(col), lit(fmt)) : Column.invoke("to_timestamp", _col(col))

#to_utc_timestamp(col, tz) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 171

def to_utc_timestamp(col, tz) = Column.invoke("to_utc_timestamp", _col(col), lit(tz))

#transform(col) {|element| ... } ⇒ `Column`

Transform each element of an array. The block receives a Column (and optionally the index) and returns a Column.

Yield Parameters:

element (Column)

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 245

def transform(col, &block) = Column.invoke("transform", _col(col), _lambda(block))
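
Since the block may take either the element alone or the element and its index, a client has to tell the two forms apart. One possible way is to inspect the block's arity. The sketch below is an assumption about the approach, not a reproduction of `_lambda`; the `lambda_param_names` helper and the placeholder names are hypothetical.

```ruby
# Hypothetical helper: choose placeholder variable names from the block's arity.
# Illustrates one possible approach only; it does not reproduce _lambda.
def lambda_param_names(&block)
  count = block.arity.abs.clamp(1, 2) # 1 => element only, 2 => element and index
  %w[element index].first(count)
end

one_arg = lambda_param_names { |e| e }
two_arg = lambda_param_names { |e, i| [e, i] }
p one_arg
p two_arg
```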

#transform_keys(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 250

def transform_keys(col, &block) = Column.invoke("transform_keys", _col(col), _lambda(block))

#transform_values(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 251

def transform_values(col, &block) = Column.invoke("transform_values", _col(col), _lambda(block))

#translate(col, matching, replace) ⇒ `Column`

Replaces each character of col that appears in matching with the character at the same position in replace.

Returns:

(Column) —
col with each character listed in matching replaced by the character at the same position in replace.




# File 'lib/spark_connect/functions.rb', line 129

def translate(col, matching, replace) = Column.invoke("translate", _col(col), lit(matching), lit(replace))

#trim(*cols) ⇒ `Column`

The Spark SQL trim function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#trunc(col, fmt) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 162

def trunc(col, fmt) = Column.invoke("trunc", _col(col), lit(fmt))

#typeof(*cols) ⇒ `Column`

The Spark SQL typeof function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#ucase(*cols) ⇒ `Column`

The Spark SQL ucase function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#udf ⇒ `Object`

UDFs require a server-side execution environment (Python/Scala) and are not supported by the pure-Ruby client.

Raises:

(NotImplementedError)




# File 'lib/spark_connect/functions.rb', line 273

def udf(*)
  raise NotImplementedError, "User-defined functions are not supported by the Ruby Spark Connect client"
end
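
Callers that want to degrade gracefully need to rescue this error by name. The sketch below uses a stand-in module (`FakeFunctions`, a hypothetical name) with the same method body as above so that it runs without a Spark Connect server; `udf_supported?` is likewise a hypothetical helper.

```ruby
# Stand-in module with the same body as the documented udf method.
module FakeFunctions
  module_function

  def udf(*)
    raise NotImplementedError, "User-defined functions are not supported by the Ruby Spark Connect client"
  end
end

# Hypothetical probe: returns false when the client rejects UDFs.
def udf_supported?
  FakeFunctions.udf
  true
rescue NotImplementedError
  # NotImplementedError descends from ScriptError, not StandardError,
  # so a bare `rescue` would not catch it; the class must be named.
  false
end

puts udf_supported?
```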

#unbase64(*cols) ⇒ `Column`

The Spark SQL unbase64 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unhex(*cols) ⇒ `Column`

The Spark SQL unhex function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_date(*cols) ⇒ `Column`

The Spark SQL unix_date function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_micros(*cols) ⇒ `Column`

The Spark SQL unix_micros function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_millis(*cols) ⇒ `Column`

The Spark SQL unix_millis function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_seconds(*cols) ⇒ `Column`

The Spark SQL unix_seconds function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ `Object`




# File 'lib/spark_connect/functions.rb', line 166

def unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss")
  col.nil? ? Column.invoke("unix_timestamp") : Column.invoke("unix_timestamp", _col(col), lit(fmt))
end
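
The branch above means the format is only sent when a column is given; without a column the function is invoked with no arguments at all. A self-contained sketch of that argument selection follows. The `unix_timestamp_args` helper is hypothetical, and plain strings stand in for the `_col(...)` and `lit(...)` Columns.

```ruby
# Hypothetical stand-in returning the argument list the real method would
# hand to Column.invoke; strings replace _col(...) and lit(...) Columns.
def unix_timestamp_args(col = nil, fmt = "yyyy-MM-dd HH:mm:ss")
  col.nil? ? ["unix_timestamp"] : ["unix_timestamp", col, fmt]
end

p unix_timestamp_args                             # no column: zero-argument call
p unix_timestamp_args("created_at")               # default format is appended
p unix_timestamp_args("created_at", "yyyy/MM/dd") # explicit format
p unix_timestamp_args(nil, "yyyy/MM/dd")          # format is dropped without a column
```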

#upper(*cols) ⇒ `Column`

The Spark SQL upper function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#uuid ⇒ `Column`

The Spark SQL uuid function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#var_pop(*cols) ⇒ `Column`

The Spark SQL var_pop function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#var_samp(*cols) ⇒ `Column`

The Spark SQL var_samp function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#variance(*cols) ⇒ `Column`

The Spark SQL variance function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#version ⇒ `Column`

The Spark SQL version function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#weekday(*cols) ⇒ `Column`

The Spark SQL weekday function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#weekofyear(*cols) ⇒ `Column`

The Spark SQL weekofyear function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#when(condition, value) ⇒ `Column`

Starts a CASE WHEN expression. Add further branches with Column#when and finish with Column#otherwise.

Returns:

(Column)




# File 'lib/spark_connect/functions.rb', line 51

def when(condition, value)
  Column.invoke("when", condition, value)
end

#xxhash64(*cols) ⇒ `Column`

The Spark SQL xxhash64 function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#year(*cols) ⇒ `Column`

The Spark SQL year function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#zip_with(left, right, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 249

def zip_with(left, right, &block) = Column.invoke("zip_with", _col(left), _col(right), _lambda(block))
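
Spark SQL documents zip_with as merging two arrays element by element with a function, padding the shorter array with nulls first. The plain-Ruby sketch below imitates that behaviour on local arrays to show what the block is expected to compute. The name zip_with_sketch is hypothetical, and the real method builds a Column expression from the block instead of running it locally.

```ruby
# Plain-Ruby analogue of element-wise merging; not the library's code.
def zip_with_sketch(left, right)
  length = [left.length, right.length].max
  # Indexing past the end returns nil, which pads the shorter array.
  Array.new(length) { |i| yield(left[i], right[i]) }
end

sums = zip_with_sketch([1, 2, 3], [10, 20, 30]) { |l, r| l + r }
puts sums.inspect    # [11, 22, 33]

pairs = zip_with_sketch([1, 2], [10, 20, 30]) { |l, r| [l, r] }
puts pairs.inspect   # [[1, 10], [2, 20], [nil, 30]]
```

Because of the padding, a block passed to zip_with should be prepared to receive a null on either side when the input arrays differ in length.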

.lambda_counter ⇒ `Object`

#_col(value) ⇒ `Object`

#_lambda(block) ⇒ `Object`

#_lit_or_col(value) ⇒ `Object`

#abs(*cols) ⇒ `Column`

#acos(*cols) ⇒ `Column`

#acosh(*cols) ⇒ `Column`

#add_months(col, months) ⇒ `Object`

#aggregate(col, initial, merge, finish = nil) ⇒ `Column`

#any_value(*cols) ⇒ `Column`

#approx_count_distinct(col, rsd = nil) ⇒ `Column`

#array(*cols) ⇒ `Column`

#array_append(col, value) ⇒ `Object`

#array_compact(*cols) ⇒ `Column`

#array_contains(col, value) ⇒ `Object`

#array_distinct(*cols) ⇒ `Column`

#array_except(*cols) ⇒ `Column`

#array_insert(col, pos, value) ⇒ `Object`

#array_intersect(*cols) ⇒ `Column`

#array_join(col, delimiter, null_replacement = nil) ⇒ `Object`

#array_max(*cols) ⇒ `Column`

#array_min(*cols) ⇒ `Column`

#array_position(col, value) ⇒ `Object`

#array_prepend(col, value) ⇒ `Object`

#array_remove(col, element) ⇒ `Object`

#array_repeat(col, count) ⇒ `Object`

#array_sort(*cols) ⇒ `Column`

#array_union(*cols) ⇒ `Column`

#arrays_overlap(*cols) ⇒ `Column`

#arrays_zip(*cols) ⇒ `Column`

#asc(col) ⇒ `Column`

#asc_nulls_first(col) ⇒ `Object`

#asc_nulls_last(col) ⇒ `Object`

#ascii(*cols) ⇒ `Column`

#asin(*cols) ⇒ `Column`

#asinh(*cols) ⇒ `Column`

#atan(*cols) ⇒ `Column`

#atan2(*cols) ⇒ `Column`

#atanh(*cols) ⇒ `Column`

#avg(*cols) ⇒ `Column`

#base64(*cols) ⇒ `Column`

#bin(*cols) ⇒ `Column`

#bit_and(*cols) ⇒ `Column`

#bit_count(*cols) ⇒ `Column`

#bit_length(*cols) ⇒ `Column`

#bit_or(*cols) ⇒ `Column`

#bit_xor(*cols) ⇒ `Column`

#bitwise_not(*cols) ⇒ `Column`

#bool_and(*cols) ⇒ `Column`

#bool_or(*cols) ⇒ `Column`

#broadcast(df) ⇒ `DataFrame`

#bround(col, scale = 0) ⇒ `Column`

#cardinality(*cols) ⇒ `Column`

#cbrt(*cols) ⇒ `Column`

#ceil(*cols) ⇒ `Column`

#ceiling(*cols) ⇒ `Column`

#char_length(*cols) ⇒ `Column`

#character_length(*cols) ⇒ `Column`

#coalesce(*cols) ⇒ `Column`

#col(name) ⇒ `Column` Also known as: column

#collect_list(*cols) ⇒ `Column`

#collect_set(*cols) ⇒ `Column`

#concat(*cols) ⇒ `Column`

#concat_ws(sep, *cols) ⇒ `Column`

#conv(col, from_base, to_base) ⇒ `Column`

#corr(*cols) ⇒ `Column`

#cos(*cols) ⇒ `Column`

#cosh(*cols) ⇒ `Column`

#cot(*cols) ⇒ `Column`

#count(col) ⇒ `Column`

#count_distinct(*cols) ⇒ `Column` Also known as: countDistinct

#count_if(*cols) ⇒ `Column`

#covar_pop(*cols) ⇒ `Column`

#covar_samp(*cols) ⇒ `Column`

#crc32(*cols) ⇒ `Column`

#create_map(*cols) ⇒ `Column`

#csc(*cols) ⇒ `Column`

#cume_dist ⇒ `Column`

#current_catalog ⇒ `Column`

#current_database ⇒ `Column`