Spark SQL Built-in Functions: String Functions -- 688IT编程网

1. concat: concatenate strings

concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.

Examples:

> SELECT concat('Spark', 'SQL');
SparkSQL

2. concat_ws: concatenate strings with a separator

concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep.

Examples:

> SELECT concat_ws(' ', 'Spark', 'SQL');
Spark SQL
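To make the separator behaviour concrete, here is a minimal plain-Python model of concat_ws. It is a sketch, not Spark code: the helper name and the use of None to stand in for SQL NULL are assumptions for illustration. Like the signature above, it accepts both strings and lists of strings, and it skips NULL values instead of joining them.

```python
def concat_ws(sep, *args):
    """Plain-Python model of concat_ws; None stands in for SQL NULL."""
    if sep is None:
        return None  # a NULL separator yields NULL
    parts = []
    for arg in args:
        # Each argument is either a single string or a list of strings.
        items = arg if isinstance(arg, list) else [arg]
        # NULL values are skipped rather than joined.
        parts.extend(item for item in items if item is not None)
    return sep.join(parts)


print(concat_ws(' ', 'Spark', 'SQL'))
print(concat_ws(',', ['a', 'b'], None, 'c'))
```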

3. decode: character-set decoding

decode(bin, charset) - Decodes the first argument using the second argument character set.

Examples:

> SELECT decode(encode('abc', 'utf-8'), 'utf-8');
abc

4. encode: character-set encoding

encode(str, charset) - Encodes the first argument using the second argument character set. The result is binary data.

Examples:

> SELECT encode('abc', 'utf-8');
abc
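The encode/decode round trip is easier to see when the binary value is visible. The following plain-Python sketch (an analogy, not Spark code) uses the built-in str.encode and bytes.decode methods, which play the same roles as encode(str, charset) and decode(bin, charset).

```python
# str -> bytes, analogous to encode(str, charset)
raw = 'abc'.encode('utf-8')

# bytes -> str, analogous to decode(bin, charset)
text = raw.decode('utf-8')

print(raw)   # the binary value
print(text)  # the original string again
```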

5. format_string / printf: format a string

format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.

Examples:

> SELECT format_string("Hello World %d %s", 100, "days");
Hello World 100 days
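For readers more familiar with Python, the same example can be reproduced with Python's printf-style % operator. This is only a stand-in for illustration: the %d and %s specifiers behave alike in this simple case, but the complete set of specifiers may differ between Spark and Python.

```python
# Python's % operator as a stand-in for format_string(strfmt, obj, ...).
fmt = "Hello World %d %s"
result = fmt % (100, "days")
print(result)
```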

6. initcap: capitalize the first letter of each word and lowercase the rest; lower converts the whole string to lowercase, upper to uppercase

initcap(str) - Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

Examples:

> SELECT initcap('sPark sql');
Spark Sql
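The rule described above can be modelled in a few lines of plain Python. This is a sketch under one simplifying assumption: only the space character is treated as a word delimiter. The helper name is chosen to match the SQL function and is not part of any library.

```python
def initcap(s):
    """Model of initcap, assuming words are separated by single spaces."""
    # str.capitalize() uppercases the first character and lowercases the rest.
    # Splitting on ' ' (not on arbitrary whitespace) preserves runs of spaces.
    return ' '.join(word.capitalize() for word in s.split(' '))


print(initcap('sPark sql'))
```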

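7. length: length of a string

length(expr) - Returns the character length of string data or the number of bytes of binary data. Trailing spaces are counted.

Examples:

> SELECT length('Spark SQL ');
10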

8. levenshtein: edit distance (the number of single-character edits needed to turn one string into the other)

levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.

Examples:

> SELECT levenshtein('kitten', 'sitting');
3
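The distance of 3 for 'kitten' and 'sitting' comes from two substitutions (k to s, e to i) and one insertion (g). A minimal plain-Python sketch of the standard dynamic-programming computation is shown below; it illustrates the definition and is not a copy of Spark's implementation.

```python
def levenshtein(a, b):
    """Levenshtein distance via the classic row-by-row dynamic program."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(levenshtein('kitten', 'sitting'))
```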

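9. lpad: return a string of fixed length, padding on the left with a given pad string when the input is too short; rpad pads on the right

lpad(str, len, pad) - Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.

Examples:

> SELECT lpad('hi', 5, '??');
???hi

> SELECT lpad('hi', 1, '??');
h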
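Both behaviours of lpad, padding and truncation, can be modelled in plain Python. The sketch below is illustrative only; the helper name mirrors the SQL function, and returning the input unchanged for an empty pad string is an assumption about an edge case the description above does not cover.

```python
def lpad(s, length, pad):
    """Model of lpad(str, len, pad): pad on the left or truncate to length."""
    if length <= len(s):
        # The input is already long enough: keep only the first `length` chars.
        return s[:max(length, 0)]
    if not pad:
        return s  # assumption: nothing to pad with, so return the input as is
    missing = length - len(s)
    # Repeat the pad string and cut it to exactly the number of missing chars.
    fill = (pad * missing)[:missing]
    return fill + s


print(lpad('hi', 5, '??'))
print(lpad('hi', 1, '??'))
```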

10. ltrim: remove leading spaces or specified leading characters; rtrim removes them on the right, trim on both sides

ltrim(str) - Removes the leading space characters from str.

ltrim(trimStr, str) - Removes leading characters of str that appear in trimStr.

Examples:

> SELECT ltrim(' SparkSQL ');
SparkSQL

> SELECT ltrim('Sp', 'SSparkSQLS');
arkSQLS
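Note that trimStr is treated as a set of characters, not as a prefix: every leading character that appears in 'Sp' is removed until a character outside the set is reached. Python's str.lstrip has the same semantics, so it serves as a convenient stand-in (this is an analogy, not Spark code).

```python
# One-argument form: strip leading spaces only; trailing spaces remain.
trimmed_spaces = '    SparkSQL   '.lstrip(' ')

# Two-argument form: strip any leading 'S' or 'p' characters.
trimmed_chars = 'SSparkSQLS'.lstrip('Sp')

print(repr(trimmed_spaces))
print(trimmed_chars)
```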

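11. regexp_extract / regexp_replace: extract or replace text with a regular expression

regexp_extract(str, regexp[, idx]) - Extracts a group that matches regexp.

regexp_replace(str, regexp, rep) - Replaces all substrings of str that match regexp with rep.

Depending on the Spark version and the spark.sql.parser.escapedStringLiterals setting, the backslashes in the pattern may need to be doubled, for example '(\\d+)'.

Examples:

> SELECT regexp_extract('100-200', '(\d+)-(\d+)', 1);
100

> SELECT regexp_replace('100-200', '(\d+)', 'num');
num-num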
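The two regular-expression examples can be reproduced with Python's re module. This is an analogy for illustration: Spark uses Java regular expressions, but these particular patterns mean the same thing in both dialects.

```python
import re

# Group 1 of the first match, analogous to regexp_extract(str, regexp, 1).
first_number = re.search(r'(\d+)-(\d+)', '100-200').group(1)

# Replace every match, analogous to regexp_replace(str, regexp, rep).
replaced = re.sub(r'(\d+)', 'num', '100-200')

print(first_number)
print(replaced)
```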

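12. repeat: repeat a string

repeat(str, n) - Returns the string which repeats the given string value n times.

Examples:

> SELECT repeat('123', 2);
123123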

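13. instr / locate: position of a substring

instr(str, substr) - Returns the (1-based) index of the first occurrence of substr in str.

locate(substr, str[, pos]) - Returns the position of the first occurrence of substr in str after position pos. Note that the argument order is the reverse of instr.

Examples:

> SELECT instr('SparkSQL', 'SQL');
6

> SELECT locate('bar', 'foobarbar');
4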
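Because both functions are 1-based, a result of 0 means "not found". The plain-Python sketch below models them with str.find, which is 0-based and returns -1 when nothing matches. The helper names mirror the SQL functions, and pos is assumed to be at least 1.

```python
def instr(s, substr):
    """1-based index of the first occurrence of substr in s, or 0 if absent."""
    return s.find(substr) + 1  # find() gives -1 when absent, so this gives 0


def locate(substr, s, pos=1):
    """Like instr with swapped arguments, searching from 1-based position pos."""
    return s.find(substr, pos - 1) + 1  # assumes pos >= 1


print(instr('SparkSQL', 'SQL'))
print(locate('bar', 'foobarbar'))
print(locate('bar', 'foobarbar', 5))
```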

14. space: produce n spaces, for example to prefix a string

space(n) - Returns a string consisting of n spaces.

Examples:

> SELECT concat(space(2), '1');
  1

The result is two spaces followed by 1.

15. split: split a string on a pattern

split(str, regex) - Splits str around occurrences that match regex.

Examples:

> SELECT split('oneAtwoBthreeC', '[ABC]');
["one","two","three",""]
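The trailing empty string in the result appears because the input ends with a delimiter ('C'), so there is an empty piece after it. Python's re.split behaves the same way for this input and can be used to check the reasoning (an analogy, not Spark code).

```python
import re

# 'C' is the last character, so the piece after it is the empty string.
parts = re.split(r'[ABC]', 'oneAtwoBthreeC')
print(parts)
```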

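16. substr / substring_index: extract part of a string

substr(str, pos[, len]) - Returns the substring of str that starts at pos and is of length len. Positions are 1-based; a negative pos counts from the end of the string.

substring_index(str, delim, count) - Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned.

Examples:

> SELECT substr('Spark SQL', 5);
k SQL

> SELECT substr('Spark SQL', -3);
SQL

> SELECT substr('Spark SQL', 5, 1);
k

> SELECT substring_index('www.apache.org', '.', 2);
www.apache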
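The position rules are easy to get wrong, so here is a minimal plain-Python model of both functions. It is a sketch under stated assumptions: substr only handles a nonzero pos that falls inside the string, and returning an empty string for a zero count or an empty delimiter in substring_index is an assumption about edge cases. The helper names mirror the SQL functions.

```python
def substr(s, pos, length=None):
    """Model of substr for a nonzero pos that falls inside the string."""
    # Positive pos is 1-based from the left; negative pos counts from the end.
    start = pos - 1 if pos > 0 else len(s) + pos
    end = len(s) if length is None else start + length
    return s[start:end]


def substring_index(s, delim, count):
    """Model of substring_index(str, delim, count)."""
    if count == 0 or not delim:
        return ''  # assumption: these edge cases yield an empty string
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # left of the count-th delimiter
    return delim.join(parts[count:])      # right of the count-th from the end


print(substr('Spark SQL', 5))
print(substr('Spark SQL', -3))
print(substr('Spark SQL', 5, 1))
print(substring_index('www.apache.org', '.', 2))
```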

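17. translate: replace characters one for one

translate(input, from, to) - Translates the input string by replacing the characters present in the from string with the corresponding characters in the to string.

Examples:

> SELECT translate('AaBbCc', 'abc', '123');
A1B2C3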
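The mapping is per character and case-sensitive: 'a' becomes '1', 'b' becomes '2', 'c' becomes '3', and the uppercase letters are left alone. Python's str.maketrans and str.translate express the same idea when the two strings have equal length (an analogy, not Spark code).

```python
# Build a character-to-character table: a->1, b->2, c->3.
table = str.maketrans('abc', '123')

# Characters not in the table (here the uppercase letters) are unchanged.
result = 'AaBbCc'.translate(table)
print(result)
```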

18. get_json_object

get_json_object(json_txt, path) - Extracts a json object from path.

Examples:

> SELECT get_json_object('{"a":"b"}', '$.a');
b
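In the path, $ denotes the root of the document and .a selects the field named a. The plain-Python sketch below models only that subset (dotted field names, no array indexes) using the standard json module. The helper is hypothetical and returns None where a SQL NULL would be expected.

```python
import json


def get_json_object(json_txt, path):
    """Model of get_json_object for paths of the form $.key1.key2 only."""
    if not path.startswith('$'):
        return None
    node = json.loads(json_txt)
    for key in [k for k in path[1:].split('.') if k]:
        if not isinstance(node, dict) or key not in node:
            return None  # stands in for SQL NULL when the path does not match
        node = node[key]
    # Strings are returned as is; other values are returned as JSON text.
    if isinstance(node, str):
        return node
    return json.dumps(node, separators=(',', ':'))


print(get_json_object('{"a":"b"}', '$.a'))
```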

19. unhex: convert hexadecimal to binary

unhex(expr) - Converts hexadecimal expr to binary.

Examples:

> SELECT decode(unhex('537061726B2053514C'), 'UTF-8');
Spark SQL
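Each pair of hexadecimal digits is one byte (53 is S, 70 is p, and so on), which is why the result has to be passed through decode to become readable text. The same two steps in plain Python, as an analogy rather than Spark code:

```python
# Hex text -> bytes, analogous to unhex(expr).
raw = bytes.fromhex('537061726B2053514C')

# Bytes -> string, analogous to decode(bin, 'UTF-8').
text = raw.decode('utf-8')
print(text)
```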

20. to_json

to_json(expr[, options]) - Returns a JSON string for a given struct, array, or map value.

Examples:

> SELECT to_json(named_struct('a', 1, 'b', 2));
{"a":1,"b":2}

> SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'));
{"time":"26/08/2015"}

> SELECT to_json(array(named_struct('a', 1, 'b', 2)));
[{"a":1,"b":2}]

> SELECT to_json(map('a', named_struct('b', 1)));
{"a":{"b":1}}

> SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)));
{"[1]":{"b":2}}

> SELECT to_json(map('a', 1));
{"a":1}

> SELECT to_json(array((map('a', 1))));
[{"a":1}]
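As a rough analogy, structs and maps correspond to Python dicts and arrays to Python lists, so the simpler outputs above can be reproduced with json.dumps. This is an illustration only: the compact separators are chosen to match the output shown above, and options such as timestampFormat have no counterpart here.

```python
import json

compact = (',', ':')  # no spaces after separators, matching the output above

struct_json = json.dumps({'a': 1, 'b': 2}, separators=compact)
array_json = json.dumps([{'a': 1, 'b': 2}], separators=compact)
nested_json = json.dumps({'a': {'b': 1}}, separators=compact)

print(struct_json)
print(array_json)
print(nested_json)
```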
