pyspark.sql.functions.inline
- pyspark.sql.functions.inline(col)
Explodes an array of structs into a table.
This function takes an input column containing an array of structs and returns a new column where each struct in the array is exploded into a separate row.
New in version 3.4.0.
- Parameters
- col : Column or column name
    Input column of values to explode.
- Returns
Column
Generator expression with the inline exploded result.
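The row-expansion semantics can be sketched in plain Python. This is an illustrative model only, not Spark's implementation; the helper name `inline_rows` and the default field names are assumptions for the sketch:

```python
# Illustrative model of inline's semantics: each struct in the input
# array becomes one output row, with the struct's fields as columns.
def inline_rows(array_of_structs, field_names=("a", "b")):
    """Yield one dict per struct. A NULL (None) or empty array yields
    no rows; a NULL struct element yields a row of all-NULL fields."""
    if not array_of_structs:  # None or [] -> the source row disappears
        return
    for struct in array_of_structs:
        if struct is None:
            yield {name: None for name in field_names}
        else:
            yield dict(struct)
```

For instance, `list(inline_rows([{"a": 1, "b": 2}, {"a": 3, "b": 4}]))` produces the two rows shown in Example 1 below.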
Examples
Example 1: Using inline with a single struct array column
>>> import pyspark.sql.functions as sf
>>> df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
>>> df.select('*', sf.inline(df.a)).show()
+----------------+---+---+
|               a|  a|  b|
+----------------+---+---+
|[{1, 2}, {3, 4}]|  1|  2|
|[{1, 2}, {3, 4}]|  3|  4|
+----------------+---+---+
Example 2: Using inline with a column name
>>> import pyspark.sql.functions as sf
>>> df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
>>> df.select('*', sf.inline('a')).show()
+----------------+---+---+
|               a|  a|  b|
+----------------+---+---+
|[{1, 2}, {3, 4}]|  1|  2|
|[{1, 2}, {3, 4}]|  3|  4|
+----------------+---+---+
Example 3: Using inline with an alias
>>> import pyspark.sql.functions as sf
>>> df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a')
>>> df.select('*', sf.inline('a').alias("c1", "c2")).show()
+----------------+---+---+
|               a| c1| c2|
+----------------+---+---+
|[{1, 2}, {3, 4}]|  1|  2|
|[{1, 2}, {3, 4}]|  3|  4|
+----------------+---+---+
Example 4: Using inline with multiple struct array columns
>>> import pyspark.sql.functions as sf
>>> df = spark.sql('SELECT ARRAY(NAMED_STRUCT("a",1,"b",2), NAMED_STRUCT("a",3,"b",4)) AS a1, ARRAY(NAMED_STRUCT("c",5,"d",6), NAMED_STRUCT("c",7,"d",8)) AS a2')
>>> df.select(
...     '*', sf.inline('a1')
... ).select('*', sf.inline('a2')).show()
+----------------+----------------+---+---+---+---+
|              a1|              a2|  a|  b|  c|  d|
+----------------+----------------+---+---+---+---+
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  1|  2|  5|  6|
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  1|  2|  7|  8|
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  3|  4|  5|  6|
|[{1, 2}, {3, 4}]|[{5, 6}, {7, 8}]|  3|  4|  7|  8|
+----------------+----------------+---+---+---+---+
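Chaining two inline calls as above pairs every struct from the first array with every struct from the second, i.e. a Cartesian product of the two arrays. A plain-Python sketch of that expansion (illustrative only; the variable names mirror the example):

```python
from itertools import product

a1 = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
a2 = [{"c": 5, "d": 6}, {"c": 7, "d": 8}]

# Each inline pass multiplies the row count by the array length,
# so two chained calls yield len(a1) * len(a2) = 4 rows.
rows = [{**s1, **s2} for s1, s2 in product(a1, a2)]
```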
Example 5: Using inline with a nested struct array column
>>> import pyspark.sql.functions as sf
>>> df = spark.sql('SELECT NAMED_STRUCT("a",1,"b",2,"c",ARRAY(NAMED_STRUCT("c",3,"d",4), NAMED_STRUCT("c",5,"d",6))) AS s')
>>> df.select('*', sf.inline('s.c')).show(truncate=False)
+------------------------+---+---+
|s                       |c  |d  |
+------------------------+---+---+
|{1, 2, [{3, 4}, {5, 6}]}|3  |4  |
|{1, 2, [{3, 4}, {5, 6}]}|5  |6  |
+------------------------+---+---+
Example 6: Using inline with a column containing an array with a NULL element, an empty array, and NULL
>>> from pyspark.sql import functions as sf
>>> df = spark.sql('SELECT * FROM VALUES (1,ARRAY(NAMED_STRUCT("a",1,"b",2), NULL, NAMED_STRUCT("a",3,"b",4))), (2,ARRAY()), (3,NULL) AS t(i,s)')
>>> df.show(truncate=False)
+---+----------------------+
|i  |s                     |
+---+----------------------+
|1  |[{1, 2}, NULL, {3, 4}]|
|2  |[]                    |
|3  |NULL                  |
+---+----------------------+
>>> df.select('*', sf.inline('s')).show(truncate=False)
+---+----------------------+----+----+
|i  |s                     |a   |b   |
+---+----------------------+----+----+
|1  |[{1, 2}, NULL, {3, 4}]|1   |2   |
|1  |[{1, 2}, NULL, {3, 4}]|NULL|NULL|
|1  |[{1, 2}, NULL, {3, 4}]|3   |4   |
+---+----------------------+----+----+
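Note how the rows with an empty array (i=2) and a NULL array (i=3) vanish from the output: inline drops them, while the companion function inline_outer keeps such rows with all struct fields set to NULL. A plain-Python sketch of the difference (an assumed model, not Spark's implementation; field names are placeholders):

```python
def inline(arr, fields=("a", "b")):
    # Skip NULL/empty arrays entirely (the behaviour shown above).
    for s in (arr or []):
        yield dict(s) if s is not None else {f: None for f in fields}

def inline_outer(arr, fields=("a", "b")):
    # Same expansion, but emit one all-NULL row when the array
    # is NULL or empty, so the source row survives.
    produced = False
    for row in inline(arr, fields):
        produced = True
        yield row
    if not produced:
        yield {f: None for f in fields}
```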