One way to remove whitespace exclusively between numbers and non-numbers in Redshift is by using regular expressions and the REGEXP_REPLACE() function.

For example, let's say we have a column called phone_number that contains values like 123 456 7890 and 555-555-1212. We want to remove whitespace only where it separates a number from a non-numeric character, and leave all other whitespace (including whitespace between two numbers) intact.

Here's how we can do it:

SELECT REGEXP_REPLACE(phone_number, '([0-9]+)\s+([^\s0-9])', '\1\2') AS cleaned_phone_number
FROM table_name;

Explanation:

  • REGEXP_REPLACE() function searches for a pattern in the phone_number column and replaces it with a specified string.
  • The regular expression pattern '([0-9]+)\s+([^\s0-9])' matches a sequence of one or more digits ([0-9]+), followed by one or more whitespace characters (\s+), followed by a single character that is neither whitespace nor a digit ([^\s0-9]).
  • The parentheses ( ) capture the matched digits and the non-digit character as two separate groups, which we refer to in the replacement string as \1 and \2, respectively.
  • The replacement string '\1\2' replaces the matched pattern with the two captured groups concatenated without the whitespace in between.
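
To see what the pattern does and does not touch, it can be tried outside Redshift. The following is a minimal Python sketch using the standard re module. It assumes Python's regex engine treats this particular pattern the same way Redshift does, which should hold for these simple constructs but is not guaranteed in general. The function name clean and the sample values are hypothetical.

```python
import re

# Same pattern as in the query: a digit run, whitespace,
# then one character that is neither whitespace nor a digit.
PATTERN = re.compile(r'([0-9]+)\s+([^\s0-9])')

def clean(value: str) -> str:
    """Drop whitespace that sits between a digit run and a following non-digit."""
    return PATTERN.sub(r'\1\2', value)

# Hypothetical sample values, not taken from any real table.
samples = ["123 456 7890", "555 -555-1212", "555 - 1212", "ext 42 x7"]
for s in samples:
    print(repr(s), "->", repr(clean(s)))
```

In this sketch, '123 456 7890' comes back unchanged because the whitespace sits between two numbers, '555 -555-1212' becomes '555-555-1212', and '555 - 1212' becomes '555- 1212' because the pattern only looks at whitespace that follows a number.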

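Note that this pattern only removes whitespace that follows a number and precedes a non-digit character, such as a hyphen. Whitespace that precedes a number (as in "- 1212") is left in place, and values that contain no such sequence, such as 123 456 7890, are returned unchanged. If you need to handle those cases, the pattern would need to be adjusted accordingly. Depending on how backslashes in string literals are handled in your environment, you may also need to double them (for example '\\s+' and '\\1\\2').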
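
As one possible adjustment, a second pass can cover the mirrored case, where the whitespace comes after a non-digit and before a number. The sketch below shows the idea in Python. The second pattern and the function name clean_both_sides are assumptions for illustration, not part of the original answer. In Redshift the same idea would presumably be written as two nested REGEXP_REPLACE() calls, which has not been verified here.

```python
import re

# First pass: digit run, whitespace, then a non-digit (the original pattern).
DIGIT_THEN_OTHER = re.compile(r'([0-9]+)\s+([^\s0-9])')
# Second pass (an assumption, not in the original answer): the mirrored case,
# a non-digit, whitespace, then a digit run.
OTHER_THEN_DIGIT = re.compile(r'([^\s0-9])\s+([0-9]+)')

def clean_both_sides(value: str) -> str:
    """Remove whitespace between a number and a non-number, in either order."""
    value = DIGIT_THEN_OTHER.sub(r'\1\2', value)
    return OTHER_THEN_DIGIT.sub(r'\1\2', value)

# Hypothetical sample values.
for s in ["555 - 1212", "123 456 7890", "ext 42 x7"]:
    print(repr(s), "->", repr(clean_both_sides(s)))
```

With both passes, '555 - 1212' becomes '555-1212', while '123 456 7890' is still left alone because neither pattern matches whitespace between two numbers.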