pyspark.sql.functions.validate_utf8

pyspark.sql.functions.validate_utf8(str)

Returns the input value if it is a valid UTF-8 string; otherwise raises an error.

New in version 4.0.0.

Parameters
str : Column or column name

A column of strings, each representing a UTF-8 byte sequence.

Returns
Column

The input string if it is valid UTF-8; otherwise an error is raised.

Examples

>>> import pyspark.sql.functions as sf
>>> spark.range(1).select(sf.validate_utf8(sf.lit("SparkSQL"))).show()
+-----------------------+
|validate_utf8(SparkSQL)|
+-----------------------+
|               SparkSQL|
+-----------------------+
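
The Spark example above covers only the valid case. As a rough illustration of the "return the value or fail" contract, the plain-Python sketch below applies the same idea to raw bytes without Spark. The helper name validate_utf8_bytes is hypothetical and is not part of PySpark, and Python's strict decoder is assumed here only as a stand-in: Spark's own validity rules may differ in edge cases.

```python
def validate_utf8_bytes(raw: bytes) -> str:
    """Hypothetical helper (not a PySpark API): return the decoded text
    if `raw` is valid UTF-8, otherwise raise UnicodeDecodeError."""
    # Strict decoding rejects any malformed byte sequence instead of
    # silently replacing it.
    return raw.decode("utf-8", errors="strict")


valid = "SparkSQL".encode("utf-8")
invalid = b"\x80Spark"  # a lone continuation byte cannot start a sequence

# Valid input is passed through unchanged.
print(validate_utf8_bytes(valid))

# Invalid input raises rather than returning a value.
try:
    validate_utf8_bytes(invalid)
except UnicodeDecodeError as exc:
    print("invalid UTF-8 at byte offset", exc.start)
```

The point of the sketch is the shape of the behavior: valid input is returned as-is, and invalid input fails loudly instead of being repaired or replaced with null.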