Jump to content

Using RXReplace to Remove Duplicates in String


S H

Recommended Posts

Hello,

I have a data set with a country column that contains several comma separated lists. I want to do a uniqueconcatenate across this column, but because the column can already contain multiple values (i.e. a single record might have "Canada, United States") I am not getting the output I want. My uniqueconcatenation comes out as:

"Canada, Canada, India, Canada, Israel, Canada, United States, India, United States, United States"

I know that an RXReplace could help me here, but I am beyond a beginner at these. Cribbing from the internet, the best I got was:

RXReplace(UniqueConcatenate([COUNTRY]),"([^,]+)(,[ ]*\1)+","\1","")

Which only gets me halfway there with an output of "Canada, India, Canada, Israel, Canada, United States, India, United States, United States".

Any thoughts on how this could be made to work

(And at the same time, any favourite resources for learning regular expressions)

Thank you in advance!

Link to comment
Share on other sites

Hi

I am not sure I understand the desired result, it would be great if you couldclarify with a sample dataset showing the desired result.

I am interpreting it this way:

- you have data in which each row is a different concatenation of countries. Example:

COUNTRY

A,A,B

B,B,C,A

D,A,D,A

C,B,B,B,B

-you want a new column, that for each row contains the sameunique concatenation of all the countries in all rows.

You could achieve it by creating a calculated column like this: (type all in one row)

TERR_String("output

Link to comment
Share on other sites

Hello,Yes, you have the right idea.  In this case I need this in a calculated value, but the expression would be the same as doing it in a calculated column I believe.  The only difference between your sample data and the real data is that for a given record a country will only appear once.  So if the data set is:COUNTRY

A, B

A

B, C

B, D, E

D

D, EMy desired output of the calculated value is A, B, C, D, E.  Simply doing a uniqueconcatenate doesn't work since it looks at the whole string and not subsets.Ideally I would prefer to use regular expressions if possible, as our company does not have a license that permits TERR to be used in webplayer.Appreciate your help!

Link to comment
Share on other sites

Hi

I think you googled what I have now googled. From what I see, the problems are 1) the uniqueConcatenate adds a space which is non-avoidable, 2) the proposed solution does not work if the repeated countries are not adjacent.

Here is something that worked for the example, although it is a bit of a pastiche. Unfortunately I do not have the resources to clean it or rationalise it further.

RXReplace(Trim(RXReplace(Concatenate(",",RXReplace(UniqueConcatenate([COUNTRY]),"\s+","","g")),"(.)(=.*\1)"," ","g")),"\s+",",","g)

The inner coreRXReplace(UniqueConcatenate([COUNTRY]),"\s+","","g) does the concatenation, then removes every inner space.

The basic idea is to apply a positive lookahead (.)(=.*\1)

The rest is garnish to replace spurious spaces and commas

Gaia

Link to comment
Share on other sites

I expect you are correct and we landed on the same reference page.  And yes, the issue with that solution is that the concatenated list is not sorted and so the solution only removes adjacent duplicate values.Unfortunately when I entered the below solution, the output I get is:Input: RXReplace(Trim(RXReplace(Concatenate(",",RXReplace(UniqueConcatenate([COUNTRY]),"\s+","","g")),"(.)(=.*\1)"," ","g")),"\s+",",","g")

Output: "r,l,C,I,,Uni,dS,ates"In the above, UniqueConcatenate([COUNTRY]) returns the string "Canada, Canada, India, Canada, Israel, Canada, United States, India, United States, United States".Did the above work for you  Could SF versions be at play here (I am running v7.7)Thank you again for your help!

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...