Category Archives: Regular Expressions

Trending up: Windows 7, Agile Methodologies, Scrum, Python. Everything else? Down!


LinkedIn is now listing à la carte qualifications that you can endorse your acquaintances for, or be endorsed for by them. No surprise, 20 people say I’m good at “hardware”, which is my highest endorsement. What I want to draw your attention to here is that if you hover your pointer over any of the possible qualifications, LinkedIn will show you a working definition and the year-on-year trend in the number of people who say they do-have-know-practice-are-qualified-in the specific item.

Not so surprising, people saying they know ‘hardware’ are down year on year… also C, C++, software engineering, Perforce, customer support, regression test, unit test, and so forth. VMware is 0% – neither up nor down over last year.

Agile Methodologies are up, Windows 7 is way-up, Python is up, Scrum is up. The other 44 categories, on my list, not including VMware at 0 and Windows 8 which doesn’t have a year on year trend, are down.

So, among people who list qualifications similar to mine, the majority and growth area are Python users, on Windows 7, employing Scrum and Agile project management methods.

Your choice whether that’s:

a) what everyone wants;

b) what people looking for work think they need;

c) some cross section of professionals on LinkedIn.

I think it’s worth noting in passing, but not worth a lot of study. But it is a curiosity.

Searching for apples and oranges, using grep.


Not using grep? It’s a step up from sorting and cutting and pasting in spreadsheets. You will feel, briefly, omniscient, when you use it to solve some problem that’s been bugging you. Here’s my latest:

You care about two keywords in a file, apples and oranges, and you also care about their relative positions, for whatever reason. So grepping for each, separately, is nice, but you’d really like to grep for one OR the other.

Did I mention this was grep?

grep -i 'apple\|orange' filename.ext

The -i makes it case-insensitive, just like you’d want on a first pass. The “|” vertical bar is a familiar OR operator, and the only tricky parts are to a) put the whole thing in a single set of plain single quotes, because the two words and the operator are a single syntactic unit, and b) use a backslash to mark the vertical bar as an operator and not just a literal vertical bar. (The backslashed bar is a GNU grep extension to basic regular expressions; with grep -E the bare “|” is already the operator.)

I used apple and orange in the title because they are canonically “unrelated” things, but where this technique is really useful is when the unrelated things are of orthogonal kinds: fruits and desserts. If you’ve got your recipes filed or a cookbook online, grep -i 'pie\|apple' will produce all the references to either. Pies involving apples will be found where ‘apple’ has ‘pie’ both above and below… As a human, you have a right to do that last bit in your head, the sorting out that we gatherer-hunters are bred for.
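For comparison, here is a rough sketch of the same either-or search in Java's java.util.regex, where the bare | is already the OR operator and needs no backslash. The class name, method name, and recipe lines are all made up for illustration; this is not a replacement for grep, just the same idea in another dialect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class EitherOrSearch {
    // "pie" OR "apple", ignoring case, like grep -i 'pie\|apple'.
    private static final Pattern PIE_OR_APPLE =
            Pattern.compile("pie|apple", Pattern.CASE_INSENSITIVE);

    // Returns each matching line prefixed with its 1-based line number,
    // so the relative positions of the hits stay visible.
    static List<String> matchingLines(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (PIE_OR_APPLE.matcher(lines.get(i)).find()) {
                hits.add((i + 1) + ":" + lines.get(i));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical recipe text, invented for this sketch.
        List<String> recipe = Arrays.asList(
                "Grandma's Apple Pie",
                "2 cups flour",
                "6 apples, peeled",
                "Bake the pie for 45 minutes");
        for (String hit : matchingLines(recipe)) {
            System.out.println(hit);
        }
    }
}
```

The flour line is the only one left out; the other three come back with their line numbers, which is the "relative position" information you then sort out in your head.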

Using Java’s String.split() to divide a string delimited by whitespace AND other things.


I have a file containing lines that look like:

#### #.##% #.##% ## ###### someLongStringAtTheEnd
(each line starts with leading blanks)
where #### is an integer and #.## is a floating-point number. The two floats are labeled as percentages by the literal percent signs.

As far as single spaces go, .split(" ") works just fine. If you want to split on more than one kind of whitespace, you have two choices. The first is to write your own regular expression naming each whitespace character, with doubled backslashes so that the Java compiler passes a single backslash through to the regex engine:

…split("[\\ \\t]+");

catches both spaces and tabs…
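A small sketch of the difference, assuming a made-up input with two spaces followed by a tab (the class and method names are only for illustration):

```java
import java.util.Arrays;

class SpaceVersusClass {
    // Treats every single space as a delimiter, as in split(" ").
    static String[] bySingleSpace(String s) {
        return s.split(" ");
    }

    // Treats any run of spaces and tabs as one delimiter,
    // using the character class shown above.
    static String[] bySpacesAndTabs(String s) {
        return s.split("[\\ \\t]+");
    }

    public static void main(String[] args) {
        String sample = "12  34\t56"; // invented input: two spaces, then a tab
        System.out.println(Arrays.toString(bySingleSpace(sample)));
        System.out.println(Arrays.toString(bySpacesAndTabs(sample)));
    }
}
```

With the single-space version, the doubled space leaves an empty string in the middle of the result and the tab stays stuck inside the last token. The character class with the "+" gives the three tokens you actually wanted.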

However, the world of regular expressions offers something even neater: \s, which matches any ASCII whitespace character (space, tab, newline, and so on).

Of course, you still have to escape it:
…split("[\\s]")

and add the percent sign, and a “+” outside the brackets to allow it to take one or more of the specified characters:
…split("[\\s%]+")
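Putting it together, here is a minimal sketch that splits one line of the shape described above. The numbers in the sample line are invented, and the class and method names are hypothetical. One wrinkle worth knowing: because the lines start with leading blanks, splitting directly would give an empty string as the first token, so this sketch trims the line first.

```java
import java.util.Arrays;

class PercentLineSplit {
    // Trim first: a line that begins with blanks would otherwise
    // produce an empty string as its first token.
    static String[] tokenize(String line) {
        return line.trim().split("[\\s%]+");
    }

    public static void main(String[] args) {
        // Made-up values in the layout from the post.
        String line = "  1234 1.25% 0.75% 12 123456 someLongStringAtTheEnd";
        String[] fields = tokenize(line);

        // With the percent signs consumed as delimiters,
        // the numeric fields parse directly.
        int count = Integer.parseInt(fields[0]);
        double firstPercent = Double.parseDouble(fields[1]);
        double secondPercent = Double.parseDouble(fields[2]);

        System.out.println(Arrays.toString(fields));
        System.out.println(count + " " + firstPercent + " " + secondPercent);
    }
}
```

The sample line comes out as six tokens, with the percent signs gone and the long string at the end left intact.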

Here is a nice discussion of whitespace delimiters, larger-than-ASCII character sets, etc.:
http://stackoverflow.com/questions/1822772/java-regular-expression-to-match-all-whitespace-characters

Bill