pyspark.sql.functions.datediff

pyspark.sql.functions.datediff(end, start)

Returns the number of days from start to end. The result is negative when end is earlier than start.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
end : Column or column name

The end ("to") date column.

start : Column or column name

The start ("from") date column.

Returns
Column

Difference in days between the two dates, computed as end minus start.

Examples

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([('2015-04-08','2015-05-10')], ['d1', 'd2'])
>>> df.select('*', sf.datediff('d1', 'd2')).show()
+----------+----------+----------------+
|        d1|        d2|datediff(d1, d2)|
+----------+----------+----------------+
|2015-04-08|2015-05-10|             -32|
+----------+----------+----------------+
>>> df.select('*', sf.datediff(df.d2, df.d1)).show()
+----------+----------+----------------+
|        d1|        d2|datediff(d2, d1)|
+----------+----------+----------------+
|2015-04-08|2015-05-10|              32|
+----------+----------+----------------+
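The sign convention in the examples above can be checked with plain Python date arithmetic. This is only an illustrative sketch using the standard library, not how Spark computes the result; `days_between` is a hypothetical helper name, and it assumes ISO `YYYY-MM-DD` strings like those in the example DataFrame.

```python
from datetime import date


def days_between(end: str, start: str) -> int:
    """Hypothetical helper: whole days from start to end (end minus start)."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


d1, d2 = "2015-04-08", "2015-05-10"

# Same argument order as sf.datediff(end, start)
forward = days_between(d2, d1)   # end is later than start, so positive
backward = days_between(d1, d2)  # end is earlier than start, so negative
print(forward, backward)
```

As in the Spark examples, swapping the arguments flips the sign of the result.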