How to convert a UNIX epoch column to a date in an Apache Spark DataFrame using Java?

I have a JSON data file that contains an attribute [creationDate], which is a UNIX epoch of "long" numeric type. The Apache Spark DataFrame schema is as follows:

root 
 |-- creationDate: long (nullable = true) 
 |-- id: long (nullable = true) 
 |-- postTypeId: long (nullable = true)
 |-- tags: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- title: string (nullable = true)
 |-- viewCount: long (nullable = true)

I want to do some groupBy on "creationDate_year", which needs to be derived from "creationDate".

What is the easiest way to do this in a DataFrame using Java?

Solution

After checking the Spark DataFrame API and SQL functions, I came up with the following snippet:

DataFrame df = sqlContext.read().json("MY_JSON_DATA_FILE");

DataFrame df_DateConverted = df.withColumn("creationDt", from_unixtime(df.col("creationDate").divide(1000)));

The reason why the "creationDate" column is divided by 1000 is that the time unit differs: the original "creationDate" is a UNIX epoch in milliseconds, while Spark SQL's "from_unixtime" expects a UNIX epoch in seconds.
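For the groupBy-by-year part of the question, a minimal sketch is below, assuming the same Spark 1.x SQLContext API as above, an existing SQLContext named sqlContext, and the placeholder path "MY_JSON_DATA_FILE". It converts the millisecond epoch to a timestamp string with from_unixtime, derives the year with the year() function, and groups on that derived column:

import static org.apache.spark.sql.functions.*;

import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

// Read the JSON file into a DataFrame (path is a placeholder).
DataFrame df = sqlContext.read().json("MY_JSON_DATA_FILE");

// creationDate is an epoch in milliseconds; from_unixtime expects seconds,
// so divide by 1000 before converting.
DataFrame dfWithDate = df.withColumn("creationDt",
        from_unixtime(df.col("creationDate").divide(1000)));

// Derive the year and group by it, e.g. counting posts per year.
DataFrame postsPerYear = dfWithDate
        .withColumn("creationDate_year", year(dfWithDate.col("creationDt")))
        .groupBy("creationDate_year")
        .count();

postsPerYear.show();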
