One way to remove leading zeros from a string (varchar) column in Spark Scala is to use the regexp_replace
function. Here's an example:
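import org.apache.spark.sql.functions._
// Needed for toDF and the $"..." column syntax; assumes a SparkSession named spark (spark-shell imports this automatically).
import spark.implicits._
val df = Seq("000123", "004567", "010987", "987654").toDF("num_string")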
df.show()
// Output:
// +----------+
// |num_string|
// +----------+
// |    000123|
// |    004567|
// |    010987|
// |    987654|
// +----------+
val df2 = df.withColumn("num_string_trimmed", regexp_replace($"num_string", "^0*", ""))
df2.show()
// Output:
// +----------+------------------+
// |num_string|num_string_trimmed|
// +----------+------------------+
// |    000123|               123|
// |    004567|              4567|
// |    010987|             10987|
// |    987654|            987654|
// +----------+------------------+
In this example, the regular expression ^0* matches zero or more occurrences of the digit 0 at the beginning of the string. The ^ character anchors the match at the start of the string, and the * quantifier means "zero or more." The regexp_replace function replaces the match with an empty string (""), effectively removing the zeros from the start of the string. The resulting DataFrame has a new column called num_string_trimmed that holds each value of the original num_string column with its leading zeros removed.
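One edge case worth checking is a value made up entirely of zeros: ^0* strips every character and leaves an empty string. The sketch below compares that pattern with an alternative, ^0+(?!$), which is an assumption on my part and not part of the original answer; its negative lookahead keeps the last character, so "0000" becomes "0". It uses plain String.replaceAll so it runs without a Spark session; Spark's regexp_replace relies on the same Java regex syntax, so the pattern should carry over. The LeadingZeros object and its member names are hypothetical.

```scala
object LeadingZeros {
  // Pattern from the example above: removes all leading zeros,
  // which empties a string that consists only of zeros.
  val stripAll = "^0*"

  // Assumed alternative (not in the original answer): the negative
  // lookahead (?!$) forbids the match from reaching the end of the
  // string, so an all-zero value keeps a single "0".
  val keepOne = "^0+(?!$)"

  // Applies a leading-zero pattern with Java's regex engine.
  def strip(value: String, pattern: String): String =
    value.replaceAll(pattern, "")

  def main(args: Array[String]): Unit = {
    val samples = Seq("000123", "010987", "987654", "0000", "0")
    samples.foreach { value =>
      val a = strip(value, stripAll)
      val b = strip(value, keepOne)
      println(s"$value -> stripAll: '$a', keepOne: '$b'")
    }
  }
}
```

If the all-zero case matters for your data, the same pattern could be passed to the Spark call, for example regexp_replace($"num_string", "^0+(?!$)", "").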
Asked: 2022-03-11 11:00:00 +0000
Last updated: Jan 12 '22