cleaning data (array related)

For general discussion related to FlowStone
tester
Posts: 1786
Joined: Wed Jan 18, 2012 10:52 pm
Location: Poland, internet

cleaning data (array related)

Post by tester »

I can't find it, so I'm not sure whether I posted this question earlier, or just started to write the post, and closed the browser in the middle. Mixed memories about arrays confuse me even more.

I guess this is a Ruby question again: cleaning data of non-numeric characters, spaces and so on. Anything after the numbers is not a big deal (the standard green prims handle that by default), but I don't know how to remove anything that comes before the data.

Let's say that I have an array written like this:

11asd,12bcv3,crd13, 14,15

A real mess, but it should end like this:

11,12,13,14,15

So, at each index, all non-numeric characters (except the comma; yep, this time the comma stays) before the first numeric substring, and everything after it, are removed.

The cleaning is needed because the data are entered manually, so users can introduce errors. I used a harsh example; usually it will be all sorts of typos.
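
For anyone reading later, the rule described above (per comma-separated item, keep only the first run of digits) could be sketched in plain Ruby roughly like this. The method name `first_numbers` is made up for illustration; inside a FlowStone Ruby component you would presumably feed it `@in` and pass the result to `output`, as in the other snippets in this thread.

```ruby
# Hypothetical helper (not from the original post): keep the first run of
# digits found in each comma-separated item and drop everything else.
def first_numbers(text)
  text.split(',')                 # one string per array item
      .map { |item| item[/\d+/] } # first digit run, or nil if there is none
      .compact                    # drop items that contained no digits
      .join(',')
end

puts first_numbers("11asd,12bcv3,crd13, 14,15")  # prints 11,12,13,14,15
```

Dropping items that contain no digits shifts the later indexes; if positions matter, replacing `.compact` with a default value for the empty items may be the safer choice.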
Need to take a break? I have something right for you.
Feel free to donate. Thank you for your contribution.
Nubeat7
Posts: 1347
Joined: Sat Apr 14, 2012 9:59 am
Location: Vienna

Re: cleaning data (array related)

Post by Nubeat7 »

it is string related in your example and you can use .gsub for it

Code:

output @in.gsub(/[A-Za-z]/, '')

The problem here is that special characters like "!" and "?" are not covered, so they will still be there; only the alphabetic characters are deleted. It would need some more digging into regex.
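
A side note on letter ranges in a character class, as a small plain-Ruby sketch: a range written `A-z` runs through the ASCII table from `A` to `z`, so it also catches the six characters that sit between the upper-case and lower-case letters (`[`, `\`, `]`, `^`, `_` and the backtick), whereas `A-Za-z` matches letters only. The method names below are invented just for the demo.

```ruby
# A-z spans ASCII 65..122, which includes [ \ ] ^ _ ` as well as the letters.
def strip_wide_range(text)
  text.gsub(/[A-z]/, '')
end

# A-Za-z covers only the 52 ASCII letters.
def strip_letters_only(text)
  text.gsub(/[A-Za-z]/, '')
end

sample = "a_b^c!d?1,2"
puts strip_wide_range(sample)    # prints !?1,2
puts strip_letters_only(sample)  # prints _^!?1,2
```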

Re: cleaning data (array related)

Post by tester »

Thanks, but there is one small thing. What about characters that are neither letters nor digits, like /.:'\ and so on? From what I see, items added inside the [] part are used as the filter (so it works for some signs), but there are some limitations, like the slashes /\ and a few others. Is there a pattern definition that removes everything except numbers (and optionally commas; that part can be remultiplexed via green prims if needed)?

Or different way - "pass through only numbers and commas" instead of "remove all except numbers and commas"?

//edit: it looks like it can go through a parser.
Last edited by tester on Fri Nov 08, 2013 11:50 pm, edited 1 time in total.

Re: cleaning data (array related)

Post by Nubeat7 »

Oh, got it: instead of matching every character in the brackets, you can put a ^ before them so it matches every character NOT in the brackets.

Code:

output @in.gsub(/[^0-9,]/, '')


http://www.tutorialspoint.com/ruby/ruby ... ssions.htm
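
As a quick check in plain Ruby, here is what that negated class does with the sample string from the first post. This is only a sketch, and the method names are invented for the demo. Note that it keeps every digit, so the stray 3 in 12bcv3 survives and that item comes out as 123 rather than 12. `String#delete` accepts the same kind of negated set and gives the same result without a regex.

```ruby
# Whitelist via regex: remove every character that is not 0-9 or a comma.
def keep_digits_and_commas(text)
  text.gsub(/[^0-9,]/, '')
end

# Same whitelist using String#delete with a negated character set.
def keep_digits_and_commas_delete(text)
  text.delete('^0-9,')
end

puts keep_digits_and_commas("11asd,12bcv3,crd13, 14,15")  # prints 11,123,13,14,15
```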

Re: cleaning data (array related)

Post by tester »

I see, thanks, will use yours.
Meanwhile I made this one (but it doesn't seem to work correctly). :-)
Attachments
custom-parser.fsm
(473 Bytes) Downloaded 1018 times
billv
Posts: 1165
Joined: Tue Aug 31, 2010 3:34 pm
Location: Australia

Re: cleaning data (array related)

Post by billv »

tester wrote:What about non-letter non-numeric characters? Like /.:'\ and so on?

Delete special characters with...

Code:

output @str.gsub("\/","")

or

Code:

output @str.gsub("\"","")
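
If the goal is to strip several special characters in one pass, including the two slashes, a character class can hold them all as long as the backslash is escaped. A small sketch in plain Ruby, with a made-up method name and an arbitrary choice of characters; the `%r{...}` literal is used so that the forward slash needs no escaping.

```ruby
# Characters to strip: / \ . : ' "  (an arbitrary set chosen for the demo)
SPECIALS = %r{[/\\.:'"]}

# Hypothetical helper: remove every character listed in SPECIALS.
def strip_specials(text)
  text.gsub(SPECIALS, '')
end

puts strip_specials("1/2\\3.4:5'6\"7")  # prints 1234567
```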

Re: cleaning data (array related)

Post by tester »

I think what Nubeat7 proposed would be better here, because it limits the choice to showing numbers and selected characters instead of removing the "unknown party".

Re: cleaning data (array related)

Post by billv »

Yes...I'm just addressing the special character removal.
I imagine you would probably map them all together next.

Re: cleaning data (array related)

Post by tester »

Sure, I understand. I just pointed it out for others who read this.