# Category Archives: How

## *Fortunate* Motorcyclist survives driving off cliff

http://www.cnn.com/videos/us/2017/08/11/motorcycle-plunges-off-cliff-santa-monica-mountains-california-orig-trnd-lab.cnn/video/playlists/caught-on-camera/

Cliff-diving motorcyclist Matthew Murray, 27, passes a “25 MPH” advisory sign in the 12th second of CNN’s video clip. This is in the 2nd run through of the crash video. In the 15th second he’s going 68 MPH as he starts to lean into the turn. He’s still going more than 50 MPH as he slides off the pavement and onto the dirt. Text on the screen says something to the effect that he “was following the turn when he thinks his steering locked up”. The video shows no such thing. He was going too fast, and could not turn sharply enough to follow the turn. He started at more than 2.5 times the advised speed. He left the pavement at 2 times the advised speed. His speed “locked” his path, not his steering.

Get the an accurate map of the curve, the size and tread pattern of the motorcycle tires and a description of the motorcycle (make, model, horsepower, brakes,weight-as-crashed) and rider (weight). Give to “Mythbusters”. Have them duplicate the failure, during deceleration, then do a binary search for the steady speed at which a motorcycle on those tires, at that weight, could follow that turn. Braking uses traction, does that change maximum speed?. Find the entry speed, before braking, that would allow the bike to make the turn. Put a GoPro on the bike for comparison pictures, and a second one showing where the front tire touches the road.

## A Software Tester’s journey from manual to political tester

I wrote this some years ago. I should simplify the context and incorporate what I reference from the OP and other responders, so that it stands alone. But this has  meaningful observations which took effort to reach, so I’m putting a copy up here to start with.

Wow, no exaggeration! I can see every event that befell poor Jim happening in the real world. HOWEVER, Jim’s a fortunate fellow, he has management attention at all! AND they look at results. AND there is a perception (no matter how shakily based) of overall product quality.

Jim was no worse than anyone else until he got automation started and mistook his personal satisfaction and enjoyment for the company’s obvious goal of shipping a stable or improving level of quality with fast turnaround on bugs and needed enhancements. This is engineering, not art. Its not self actualization, its a commercial business or a service enterprise which creates value.

All the way down this sad story, Jim accepts product failures, and testing failures. You Can Never Ignore Failures. Period. He should have turned political at that point and realized that Test, like anything, needs to be sold, shown to be valuable and productive, and needs allies. Therefore, tests need to actually be valuable and productive, and needs to make it easy for people to accept them, adopt them, and feel they are important support in their own success. Therefore he needed to measure success, as understood by his customers (developers, support, users) and maintain or improve its integrated value. Accepting failures leads to dead astronauts, wasted billions, wrongful convictions, Senate Select Committees, Frontline specials, sub-Redits, and worse.

Instead of seeing failures as a very, very, high priority, Jim turns into a man with a solution, wandering around, looking for a way to apply it. A tawdry tale, rendered no less tawdry by its oft retelling. Not insignificantly, Jim’s manager is clearly a weak and ineffective character who should have seen problems coming, or reacted when they appeared. Once Jim had made the case for automation, they might have hired someone who knew something about automation, or contracted with very carefully defined goals.

Jim might have split his team up front. He needed manual testers, who carried on the work that had been being done, with as much success as possible, and brain power applied to improve results and lower cost. A front line to hold success. Then a test automation group who focused on test automation with clear and obvious benefits

The automation environment needed to be something:

• …anyone could run;
• … which worked from a shippable product, as well as a release candidate or development build;
• …which could be triggered directly from a product build, so the build group-and-release group ran it every time;
• …which could be configured to run anything from a single, new, test to all existing tests
• in a developer’s environment, before check-in, or
• at any subsequent point, including on a previously shipped release with a support issue.

Setting up the test environment, creating a test to get the product to say “Hello world”, and recognizing that as a test pass ought to take no more than an hour longer than simply setting up the product. That assumption has to be proved every release or two with a calibrated innocent new-hire from somewhere.

Since all tests start by installing the product, license, etc, and starting it, the first thing to automate would be that. If there were changes in that functionality, over product history, the automation could start with the newest, but it had to support them all. Having this ‘smoke test’ be part of a full build would pay dividends to everyone, everywhere, and by designing with backward compatibility and future adaptability, thrash could be minimized, or at least recognized.

This would be a good time to look through the bugbase to determine where the most bugs were being found, where the most escapes were happening, and where the most critical bugs were coming from. Besides looking backward, a forward look at the product roadmap and with developers and management could highlight areas of future leverage

In parallel with automation, all of the above should be considered when selecting manual tests. Tests which never fail should be reduced relative to tests which find failures. Something that fails for diverse reasons may be a great canary in the coal mine, or might be a too fragile sponge that soaks up maintenance effort. In any event, continual improvement of the manual testing should run in parallel with introducing automation. After a small number of releases, the manual tests available should exceed the resources to run them. Selection of the ‘vital few’ should in intentional, not happenstance.

Most people can see the limitation of record and play back, so things should never stop there. The only tools worth looking at are tools that can be edited and rapidly expanded by iteration. Cut and paste is the lowest form of iteration and rapidly grows nightmares. Algorithmic test point generation is desirable, but data driven testing should get strong consideration. Algorithmic generation of literal tables which are then applied as tests separates the thing that needs to be done over and over, running the test, from the thing which is done less frequently, generating the test points.

In my life, I’ve seen a few of the failures in Jim’s story, but a lot of failures of usability by others, or by anyone, complete lack of testing in development plans. Test suites (with running tests) abandoned and no-longer run, until the next great hope get started. And far too little curiosity about which tests should be run, automatically or manually, to get the most bang for the buck.

Like I said, Jim is lucky!

View in discussion

## Software Test Methods, Levels, quiz question answers

Quiz questions about software test. My answers are probably longer than was hoped for, but specific, and most important, true and demonstrable.

1) What is the difference between functional testing and system testing?

2) What are the different testing methodologies?

1) System test is the equivalent of actual customers/users using the product. Carried out as if in the real world, with a range of detailed configurations, simulation of typical users working in typical way. It is one level of abstraction above Functional testing. Functional Test verifies that the product will do functions which it is intended to do. Play, rewind, stop, pause, fast forward. +, -, x, /, =.  Functional Tests must be drawn from the Requirements documents. System Test checks that a product which meets those requirements can be operated in the real world to solve real problems. Put another way, System test proves that the requirements selected for the product are correct.

This makes one wonder why engineers don’t do system test on the requirements before creating the design and code… mostly because its hard to do, and they’re sure they understand what the requirements should be, I suppose. I’ve never seen it done in depth.

2) “the different testing methodologies” seems over-determined. The following are ‘some’ different testing methods. There may be others.

Perhaps the intent of the question is to expose a world divided into White Box and Black Box testing, which are different from each other. But there are other dichotomies, in addition to White Box and Black Box.

Software testing methods divide into two large classes, Static and Dynamic. Static testing looks at source code, dynamic testing requires executable programs and runs them. Another division is between Using a Tool that evaluates source code and and Checking Program Output. Within either set of large groups are smaller divisions, Black Box and White Box (and Clear Box and Gray Box) are all divisions of Dynamic or Checking Output methods.  Specific methods within the large groups include

• running source code through a compiler
• running a stress test that consumes all of a given resource on the host
• running a tool that looks for memory allocation and access errors
• doing a clean install on a customer-like system and then running customer-like activities and checking their output for correctness.

Orthagonal to all of the above, Manual Test and Automated Test are infastructure-based distinctions, Automated tests may be Black Box, Unit, running a tool, checking output, or any other methodology. Manual and Automated are meta-methods.

Static Software Test Methods: Similar to, but not exactly the same as Tool Using Methods, to find problems in software source code.

2.1) Compile successfully, no errors or warnings. This is the first step before inspection, since nothing is better or cheaper at finding compiler problems than the compiler.

2.2) Inspection and code review, to see if the code is written to the standards that the organization enforces. I like and use code reviews, the formal Fagan system, and less formal “extreme programming” techniques like having a second person review all diffs or do a walk through with two people at the workstation. They work. The standards inspected for are usually helpful in preventing bugs or making them visible. Just looking usually improves product quality – the Western Electric effect if nothing else.

There may be some insight into product requirements and how the code meets them in a review. But the reviewers would need to know the requirements and the design of the software in some detail. Its difficult enough to get the code itself to be read. In Engineering Paradise, I suppose the requirements are formally linked to design features, and features to data and code that operates on that data, to create the feature.

2.3) Static analysis. Besides passing compiler checks without errors or warnings, there are static analysis tools, “lint” for example, that can inspect code for consistency with best practices and deterministic operation. Coverity, and others, have commercial products that do static test on source code.

Dynamic Software Test Methods:

2.5) Memory access / leakage software test. Rational/IBM’s Purify, like ValGrind and BoundsChecker, run an instrumented copy of the source code under test to see memory problems in a dynamic environment. Its run and the results should be checked and responded to before a large investment in further  Dynamic testing should happen.

2.6) Performance test. Measuring resources consumed, obviously time, possibly others, during repeatable, usually large-scale, operations, similar to System or Load tests. Generic data, from development testing, is necessary and may be shipped as an installation test to users. Proprietary data, under a NDA (non-disclosure agreement), may also be needed, for complex problems ans/or important customers. In normal operation, the actual outputs are not looked at, at most, spot-checked, and the tool(s) keeping track of resources are the basis of pass/fail.

2.7) Installation Test. Typically a subset of in-house performance tests, with optional, generic, data. The performance recorded is comparable between releases, instances, configurations, sites, customers, and the software maker’s own in-house performance tests. Customers can use Installation tests to verify their hardware/software environment, benchmark it, evaluate new purchases for their environment, etc.

Checking Program Output Methods:

After tool based dynamic testing, the rest of Dynamic software test is based on running the product with specific inputs and checking the outputs, in detail.

Checking can be done with with exit status, stack traces,”assert()”, exceptions, diffing large output files against ‘gold’ references, log searches, directory listings, searching for keywords in output streams indicating failure or incorrect operation, checking for expected output and no other, etc. No test failures are acceptable. Each test must be deterministic, sequence independant, and (ideally) can run automatically. No judgement required for results. All require running the program.

2.8) Unit tests of pieces of the a product, in isolation, with fake/simulated/mock resources. A great bottom-up tool for verifying software. At the unit test level is where knowledge of the code is most important to testing. It is white box/clear box, with full insight into the code under test. One explicit goal of unit test should be forcing all branches in the code to be executed. That can’t be done without allowing visibility into the code.

2.9) Integration Test. The next level above unit test, the tests of code which calls code which calls code… and the code above that! The point is that integration is where code from different groups, different companies, different points in time, certainly different engineers, comes together. Misunderstanding is always possible. Here’s one place it shows up. Visibility into the code is getting dimmer here. Some tests are more functional, if a subsystem contains complete, requirement-satisfying functions.

2.10) Functional Test. Verifying that the product will do functions which it is intended to do. Play, rewind, stop, pause, fast forward. +, -, x, /, =.  Tests here should be drawn from the Requirements documents. Things that should be tested here should start in the Requirements docs. Each requirement has to be demonstrated to have been met. Its black-box testing, run from the interface customers use, on a representative host, with no insight into the internals of the product. Unless the requirements specify low level actions.

Its not particularly combinatorial- a short program, a long program, 2+2, 1/-37. Pat head. Rub belly. Walk, Not all 3 at once.

If a word-processor has no stated limit for document size, you need to load or make a really big file, but, truly, that’s a bad spec. A practical limit of ‘n’ characters has to be agreed as the maximum size tested-to. Then you stop.

All these Tests should be drawn from the Requirements documents. Things that should be tested here should start in the Requirements docs.

All that Verification is good, but what about Validation?

Unit test,  Integration test, or Functional Test, is where Validation, proving correctness of the design, might happen. Validation test is where deep algorithms are fully exercised, broad ranges of input are fully exercised, Tests that include all possible numerals, all possible characters, all defined whitespace, read in or written out. Numbers from MinInt to MaxInt, 0 to MaxUnsigned, the full range of Unicode characters, etc., etc., are exercised.

(Errors in input numbers should be seen in System test anyway, but accepting a wide range goes here.) This is not always done very formally, because most modern code environments don’t need it. But someone ought to look at least once.

L10n (Localization) and I18n (Internationalization) that need to be selected at link time or run time can be checked here too.
This is also where path-length limits, IPv-6 addresses, etc. should be checked.

2.11) User interface test verifies the controls and indicators that users at various levels see, hear, touch, operate and respond to. This is separate from any actual work the program may do in response. This is a high-value target for automation, since it can be complex and tedious to do UI testing in great detail by hand.

2.12) System Test. Full up use of the system. Training, white-paper and demo/marketing examples. Real-world situations reproduced from bugs or solutions provided for customers. Unless requirements included complexity, this is where the complex tests start. Huge data. Complex operations.  The range of supported host configurations, min to max, gets tested here too.

We’ll want to see all the error messages, created every possible way. We’ll want to have canned setups on file, just like a customer would, and we pour them into the product, run it, and collect the output. The set pass/fail on the output.

Somewhere between System Test and Acceptance test, the scale of pass/fail goes up another level of abstraction. Software test pass/fail results are one in the same with the product pass / fail. If data and setup are good, it should run and pass. Ship the result. If the data and/or setup have a problem, it should run and fail. The failure should propagate out to be stored in detail, but in the end this is a trinary result. Pass, Fail, Not Proven

2.13) Load test, Stress test.  Load tests go to the point that all of a resource is consumed, and adding  more activity produces no more output in real time. Resources include CPU, memory, local storage, networked storage, video memory, USB ports, maximum number of users, maximum number of jobs, maximum instances of product, etc. Stress test adds data, jobs, etc, clearly (110% or more) above load test maximum.

2.14) Stability test. Long term test. Stability test and long-term test are where a server or set of servers are started and left running, doing real work, for days, weeks, months. Some of the tests must repeat inputs and expect identical outputs each time.  Resource consumption should be checked. Its fair for the application or tool to have the node to itself, but adding other applications and unrelated users here and in the Load/Stress tests is meaningful, to avoid surprises from the field.

2.15) Acceptance test.  Customer sets-up their run-time world use of the system and uses it. Everything they would normally do. If its a repeat sale, they may just clone the previous installation. Run the previous and the new system, release, patch, etc, and compare output to installed software on machines that customer likes and trusts. If the product is a new one, acceptance means judging pass-fail from the output produced.

Many other kinds of test are mentioned in conversation and literature. A web search will turn up dozens. Regression test, stability test, in the sense that a new code branch is stable, sanity test and smoke test are all forms of testing but usually, in my experience, consist of subsets of the test levels/methods listed above.

A Smoke test (run the product, make sure it loads and runs, like a hardware smoke test where you apply power, turn it on and see if any smoke comes out…) can be made from the first steps of several different methods/levels named above. If the Smoke test is more than simply running the program once, then it should probably be some part of one of the other methods/levels. Or to put it another way, the work that goes into setting up the smoke test should be shared/captured. There might be a ..test/smoke/… directory, but the contents should be copied from somewhere else.

A Sanity test, a Stability test and Regression tests are successively larger swaths, at lower and lower levels, of the System, Performance, User Interface, Functional, etc. tests. They should be specified and are not embarrassing, but their content should be drawn from or reflected by those larger level-based tests. The should not be original and alone.

What do you think?

## Searching for apples and oranges, using grep.

Not using grep? Its a step up from sorting and cutting and pasting in spreadsheets. You will feel, briefly, omniscient, when you use it to solve some problem that’s been bugging you. Here’s my latest:

You care about two keywords in a file- apples and oranges, and you also care about about their relative positions, for whatever reason. So grepping for each, separately, is nice, but you’d really like to grep for one OR the other.

Did I mention this was grep?

grep -i ‘apple\|orange’ *filename.ext*

The -i makes it case-insensitive, just like you’d want on a first pass. The “|” vertical bar is a familiar OR operator, and the only tricky parts are to a) put the whole thing in a single set of single quotes- the two words and the operator are a single syntactic unit, and b) use a back-slash to mark the vertical bar as an operator and not just a literal vertical bar.

I used apple and orange in the title because they are canonically “unrelated” things, but where this technique is really useful is when the unrelated things are in orthagonal kinds: fruits and deserts. If you’ve got your recipes filed or a cookbook on line, grep -i ‘pie\|apple’ will produce all the refernces to either. Pies involving apples will be found where ‘apple’ has ‘pie’ both above and below… As a human, you have a right to do that last bit in your head, the sorting out that we gatherer-hunters are bred for.

## Recursion III, using Java

Here’s a Java example of the classic recursion/tree/web traversal. Note that this uses an array of other nodes, not just Left and Right, so you can make n-link webs as well as proper trees using this example.

The method interMed() is used to avoid having to make the data structures static- there is much I have yet to learn about Java! But I can manage this, and that feels pretty good!

```/* recurse.java```

``` Recursive tree/web traversal, in Java. No explicit pointers, so I state that the array of pointers to nodes is an array of nodes, and treat it as such. Is it really pointers if I believe it is but can't see them? Its like String Theory... Original 11:58am 4/21 Bill4 */ import java.util.ArrayList; public class recurse { public class node { String name; ArrayList kids; }; public class leveledName { String name; int level; }; ArrayList nLs = new ArrayList (); public void recur ( node n, int level ) { // System.out.println( "recur " + n + " level " + level ); leveledName lN = new leveledName(); lN.name = n.name; lN.level= level; nLs.add( lN ); for ( int i = 0; i < n.kids.size(); i++ ) { // better way to do this? recur( n.kids.get(i), level + 1 ); } // for int i... } // recur function public void interMed() { node Q = new node(); Q.name = "Q"; Q.kids = new ArrayList (); // System.out.println( Q ); node S = new node(); S.name = "S"; S.kids = new ArrayList (); // System.out.println( S ); node T = new node(); T.name = "T"; T.kids = new ArrayList (); // System.out.println( T ); node N = new node(); N.name = "N"; N.kids = new ArrayList (); N.kids.add(Q); // System.out.println( N ); node P = new node(); P.name = "P"; P.kids = new ArrayList (); P.kids.add(S); P.kids.add(T); // System.out.println( P ); node M = new node(); M.name="M"; M.kids = new ArrayList (); M.kids.add(N); M.kids.add(P); // System.out.println( M ); recur( M, 0 ); int maxLevel = 0; for ( int i = 0; i maxLevel) { maxLevel = ((nLs.get(i)).level ); } // if ... } // for ... for (int j = maxLevel; j > -1; j-- ) { // System.out.println( "j " + j ); for ( int i = 0; i < nLs.size(); i++ ) { // System.out.println( "j " + j + " i " + i ); if ( j == (nLs.get(i)).level ) { System.out.printf( "%s %d \n", (nLs.get(i)).name, (nLs.get(i)).level ); } } // for i... } // for j... } // fn interMed public static void main ( String args[] ) { recurse r = new recurse(); r.interMed(); } // main ```

```} // class recurse ```

## Recursion II, K&R C, worked out in advance

Earlier I posted the C++ solution to a tree/web traversal programming problem. Here’s the C solution, including a vector-like array for pointers to children, so one doesn’t have to hard code left, right, etc. In this case Max Children is 5 but it can be any number.  A sample output is included below

/* recursion.C */
/* Follow-up to _ recursion problem, web prowling question at _  */

/* input:
(M)
|   \
(N)  (P)
|  \    \
(Q) (S)  (T)

(3 level b tree, M has two kids, N and P, and N has two kids, Q and S.  P has one child – T.)

output:
Q, S, T, N, P, M

*/

#include <stdio.h>
#include <stdlib.h>

#define MAX_NAME_N_LEVEL 1000
#define MAX_KIDS  5

/* Structure in which the input data arrives: */

struct node {
char name;  struct node *(kids[MAX_KIDS]);
};

/* Structure the result vector (array) is built from: */
struct nameNLevel  {
char name;   int level;
};

/* Global scope variables for putting struct node + name data, as discovered in recursive part. */

struct nameNLevel* nsNLs[ MAX_NAME_N_LEVEL ];
int nmLvlCount = 0;

/*
* Synopsis:  void recur( int level, struct node* n ) {
* args:
*    int level
*    struct node* n
* returns: void, BUT puts a record into nsNLs[] and increments nmLvlCount.
* The record contins a node name and the level it was found at.
* Apr 5, 2011  Bill Abbott
*/

void recur( int level, struct node* n ) {
/* first make the new record in the list of names and levels */

struct nameNLevel* thisNmNLvl = (struct nameNLevel*) malloc( sizeof( struct nameNLevel));    /* allocate name string & level num struct */
if (0 == thisNmNLvl ) { /* allocation failed! */
printf(“Memory allocation failed at level %d, struct node %s, go ahead and crash!\n”, level, n->name );
}

thisNmNLvl->level = level;            /* fill in level, */
if ( n != 0 ) thisNmNLvl->name = n->name;             /* 1 char name… */
nsNLs[ nmLvlCount++ ] = thisNmNLvl;

printf(“\n”);
printf(“recur level: %d    n: 0x%x   name: %c\n”, level,  n, n->name );
/*
printf(“(long)*(n > kids)  :  0X%x \n”, (long)*(n->kids) );
printf(“(long)(n > kids[0]):  0X%x \n”, (long)(n->kids[0]) );
*/
/* those two should be the same… */

if ( 0 != n->kids ) {  /* this pointer should always have an array where it points, but just in case… */

int j;
for (j=0; j<3; j++ ) {
printf(” (n > kids[%d]) = 0x%x  “, j, (n->kids[j]) );
if ( n->kids[j] ) { printf(”   >name = %c\n”, (n->kids[j])->name ); }
else { printf( “\n” ); }
}   /* ha! This was the hardest part… */
}

int i;
/* now look for any child nodes an call recursivly for them… */
for ( i = 0; n->kids[i] != 0; i++ ) {
recur(level+1,  n->kids[i]);
} /* for int it… */

} /* recur */

/*
* Synopsis: void passThrough( struct node* n )
* args:
* returns:
* no return value. creates and outputs vector of node names,
* “highest” level first, in ascending order of child vector contents..
* Mar 27, 2011  Bill Abbott
*/

void passThrough( struct node* n ) {

int i;
for( i = 0; i< MAX_NAME_N_LEVEL; i++ ) { /* not strictly required…*/
nsNLs[ i ] = 0;  // set ’em all to null to start with.
} /* for i… */

int level = 0;
nmLvlCount = 0;

recur( level, n );

int maxLevel = 0;
for (i = 0; i < nmLvlCount; i++ ) {
if ( nsNLs[ i ]->level > maxLevel ) {
maxLevel = nsNLs[ i ]->level;
} /* if…*/
} /* for int i… */

/*    printf(“\nlevel  %d    nmLvlCount  %d     maxLevel %d \n”, level, nmLvlCount, maxLevel ); */

int lvl;
printf(“\n”);
for ( lvl = maxLevel; lvl >= 0; lvl– ) {  // this is serious, collect and print, all done.
for ( i = 0; i < nmLvlCount; i++ ) {
if (nsNLs[i]->level == lvl ) {
printf( “%c, “,  nsNLs[i]->name );
}
}
} /* for int i… */
printf(“\n”);

for ( lvl = maxLevel; lvl >= 0; lvl– ) {  // this is serious, collect and print, all done.
for ( i = 0; i < nmLvlCount; i++ ) {
if (nsNLs[i]->level == lvl ) {
printf( “%d, “, nsNLs[i]->level );
}
}
} /* for int i… */
printf(“\n”);

} /* passThrough */

/*
* Synopsis: int main( int argc, char* argv[] )
* args:
* int        argc    count of command line arguments
* char*    argv[]    vector of zero-terminated arrays of char containing command line args
* returns:
* no return value. creates a tree of nodes, outputs vector of node names,
* “highest” level first, in ascending order of child vector contents..
* Apr 7, 2011  Bill Abbott
*/

int main( int argc, char* argv[] ) {

/* 3 level b tree:
* M has two kids, N and P, and
*    N has two kids, Q and S.
*        Q has no kids
*        S has no kids
*    P has one child – T.
*        T has no kids
*/

char nameIt[] =”malloc “;
char theRest[] = ” failed. Out of memory\n”;

struct node* T = malloc( sizeof(struct node));
if ( 0 == T ) { printf(“%s T %s”, nameIt, theRest ); return( 0 ); }
T->name = ‘T’;
T->kids[MAX_KIDS] = (struct node*) malloc(sizeof(struct node*) * MAX_KIDS);
if ( 0 == T->kids ) { printf(“%s T->kids %s”, nameIt, theRest); return( 0 ); }
T->kids[0] = (void*) 0;
T->kids[1] = (void*) 0;
T->kids[2] = (void*) 0;

/*
printf(“\n”);
printf(“(long)(T > kids) = 0X%x   \n”,  (long)*(T->kids) );
if ( (T->kids[0]))  printf(“*(T > kids[0]) = %c\n”,   *(T->kids[0]) );
if ( (T->kids[0]))  printf(“( (T > kids[0]) >name = 0x%x  %c\n”,   (T->kids[0])->name, (T->kids[0])->name );
if ( (T->kids[1]))  printf(“( (T > kids[1]) >name = 0x%x  %c\n”,   (T->kids[1])->name, (T->kids[1])->name );
if ( (T->kids[2]))  printf(“( (T > kids[2]) = 0x%x\n”,   (T->kids[2]) );
*/

struct node* S = malloc( sizeof(struct node));
if ( 0 == S ) { printf(“%s S %s”, nameIt, theRest ); return( 0 ); }
S->name = ‘S’;
S->kids[MAX_KIDS] = malloc( sizeof( struct node*) * MAX_KIDS);
if ( 0 == S->kids ) { printf(“%s S->kids %s”, nameIt, theRest); return( 0 ); }
S->kids[0] = (void*) 0;
S->kids[1] = (void*) 0;
S->kids[2] = (void*) 0;

struct node* Q = malloc( sizeof(struct node));
if ( 0 == Q ) { printf(“%s Q %s”, nameIt, theRest ); return( 0 ); }
Q->name = ‘Q’;
*(Q->kids) = malloc(sizeof(struct node*) * MAX_KIDS);
if ( 0 == Q->kids ) { printf(“%s Q->kids %s”, nameIt, theRest); return( 0 ); }
Q->kids[0] = (void*) 0;
Q->kids[1] = (void*) 0;
Q->kids[2] = (void*) 0;

struct node* P = malloc( sizeof(struct node));
if ( 0 == P ) { printf(“%s P %s”, nameIt, theRest ); return( 0 ); }
P->name = ‘P’;
P->kids[MAX_KIDS] = malloc(sizeof(struct node*) * MAX_KIDS );
if ( 0 == P->kids ) { printf(“%s P->kids %s”, nameIt, theRest); return( 0 ); }
P->kids[0] = T;
P->kids[1] = (void*) 0;
P->kids[2] = (void*) 0;

struct node* N = malloc( sizeof(struct node));
if ( 0 == N ) { printf(“%s N %s”, nameIt, theRest ); return( 0 ); }
N->name = ‘N’;
N->kids[MAX_KIDS] = malloc(sizeof(struct node*) * MAX_KIDS );
if ( 0 == N->kids ) { printf(“%s N->kids %s”, nameIt, theRest); return( 0 ); }
N->kids[0] = Q;
N->kids[1] = S;
N->kids[2] = (void*) 0;

struct node* M  = malloc( sizeof(struct node));
if ( 0 == N ) { printf(“%s N %s”, nameIt, theRest ); return( 0 ); }
M->name = ‘M’;
M->kids[MAX_KIDS] = malloc(sizeof(struct node*) * MAX_KIDS );
if ( 0 == M->kids ) { printf(“%s M->kids %s”, nameIt, theRest); return( 0 ); }
M->kids[0] = N;
M->kids[1] = P;
M->kids[2] = (void*) 0;

/*  printf(“\n”);
printf(“(long)(M > kids) = 0X%x   \n”,  (long)*(M->kids) );
printf(“*(M > kids[0]) = %c\n”,   *(M->kids[0]) );
printf(“( (M > kids[0]) >name = 0x%x\n”,   (M->kids[0])->name );
printf(“( (M > kids[1]) >name = 0x%x\n”,   (M->kids[1])->name );
printf(“( (M > kids[2]) = 0x%x\n”,   (M->kids[2]) );
*/

passThrough( M );

return( 1 );

} // main…

Macintosh-6:interview Bill4\$ cc recursion.c
Macintosh-6:interview Bill4\$ a.out

recur level: 0    n: 0x100260   name: M
(n > kids[0]) = 0x100220     >name = N
(n > kids[1]) = 0x1001e0     >name = P
(n > kids[2]) = 0x0

recur level: 1    n: 0x100220   name: N
(n > kids[0]) = 0x1001a0     >name = Q
(n > kids[1]) = 0x100160     >name = S
(n > kids[2]) = 0x0

recur level: 2    n: 0x1001a0   name: Q
(n > kids[0]) = 0x0
(n > kids[1]) = 0x0
(n > kids[2]) = 0x0

recur level: 2    n: 0x100160   name: S
(n > kids[0]) = 0x0
(n > kids[1]) = 0x0
(n > kids[2]) = 0x0

recur level: 1    n: 0x1001e0   name: P
(n > kids[0]) = 0x100120     >name = T
(n > kids[1]) = 0x0
(n > kids[2]) = 0x0

recur level: 2    n: 0x100120   name: T
(n > kids[0]) = 0x0
(n > kids[1]) = 0x0
(n > kids[2]) = 0x0

Q, S, T, N, P, M,
2, 2, 2, 1, 1, 0,
Macintosh-6:interview Bill4\$

## Recursion, working it out in advance

In a recent pair of interviews, I was asked tree data structure questions, and I flubbed the first one. Hacking at a function that received a pointer to a node and that I needed to use as the launching point for recursion, not the recursive bit itself. Not pretty. NEXT time, I managed to gather up my wits, with a bit of help from my interviewer, Jim Gill, I managed to thread the needle and there it was, a recursive operation that worked. Sho-‘nuf!  That original is at the bottom of this post- a complete, working version, follows this introduction. Its about 180 lines, with a lot of white space.

And here’s the deal. You can’t make  a sensible recursive routine by simply getting the pointer and recalling that top level routine. YES recursion needs something that takes a pointer to node, and yes,  you call it with a child of the original node. But there’s more to it. A non-recursive superstructure which supports a recursive heart.

So here’s a complete, working, version, that compiles and runs using “gcc” on Mac OS 10.5.8. It takes a bit more than what I started with. “C” and other versions will follow, as separate postings.

Macintosh-6:interview Bill4\$ cat recursion_v.cpp
/* recursion_v.cpp */
/* Follow-up to ___ recursion problem with Jim Gill, web prowling question at ____ */

/* input:
(M)
|   \
(N)  (P)
|  \    \
(Q) (S)  (T)
(3 level b tree, M has two children, N and P, and N has two children, Q and S.  P has one child – T.)

output:
Q, S, T, N, P, M

*/

#include <iostream>
#include <vector>

/* Structure in which the input data arrives: */

struct node {
std::vector <node*> children;  std::string name;
};

/* Structure the result vector (array) is built from: */
struct nameNLevel  {
std::string name;   int level;
};

/*
* Synopsis: void recur( node* n, int level, std::vector* nsNLs )
* args:
* node*    n     this node, which has a name, may have children
* int        level    number of steps down the tree
* std::vector    nsNLs   pointer to the vector of node-name, level structs we’ll want to report
* returns:
* no return value. Node name and level are put into the vector of nameLevel structs.
* Mar 27, 2011  Bill Abbott
*/

void recur( node* n, int level, std::vector <nameNLevel>* nsNLs ){

/* To collect additions, nsNLs vector must be passed as a pointer! */
/* Unlike an array in C, an std::vector isn’t passed as a pointer automatically! */
nameNLevel* nNL = new nameNLevel;
nNL->name = n->name;
nNL->level = level;
nsNLs->push_back( *nNL );

int vecIdx = nsNLs->size() – 1;

//    std::cout << nsNLs->size() << ”  ” << vecIdx << ”    “;
std::cout << (*nsNLs)[ vecIdx ].level << ”  “;
std::cout << (*nsNLs)[ vecIdx ].name << ”  “;
std::cout << (*nsNLs)[ vecIdx ].name.size() << ”  “;
std::cout << (*nsNLs)[ vecIdx ].name.c_str() << ”    “;

std::cout << (nsNLs->back()).name << ”  ” << (nsNLs->back()).level << ”  ” << n->name << ”  ” << level << ”  “;
std::cout << nsNLs->size() << ”  ” ;
std::cout << n->children.size() << ”  ” << (0 != n->children.size()) << “\n”;
if ( 0 != n->children.size()) {   /* there are children */
for (int childCount =0; childCount < ( n->children.size()); childCount++ ){
recur(  n->children[childCount], level+1, nsNLs ); /* recursive call- children & next level… */
} /* for int childCount… */
} /* if ( 0!=  */

} /* recur *

/*
* Synopsis: void passThrough( node* n )
* args:
* node*    n     this node, which has a name, may have children
* returns:
* no return value. creates and outputs vector of node names,
* “highest” level first, in ascending order of child vector contents..
* Mar 27, 2011  Bill Abbott
*/

void passThrough( node* n ) {
std::vector namesNLevels;
int level = 0;

recur( n, level, &namesNLevels );

std::cout<< namesNLevels.size()  << ”  ” << “\n”;

int maxLevel = 0;
for (int i=0; i < namesNLevels.size(); i++ ) {
std::cout<< maxLevel << ”  ” << i << “\n”;
if (namesNLevels[i].level > maxLevel) {
maxLevel = namesNLevels[i].level;
} /* if */
} /* for */

/* note: recursion isn’t getting the data in the order we want… */
/* so, order them before printing. */

for (level=maxLevel; level >= 0; level–){
for (int i=0; i < namesNLevels.size(); i++ ) {
if (namesNLevels[i].level == level){
std::cout << namesNLevels[i].name <<  ” “;
} /* if (namesNLevels…) */
} /* for i */
} /* for level */

printf( “\n”);

}

/*
* Synopsis: int main( int argc, char* argv[] )
* args:
* int        argc    count of command line arguments
* char*    argv[]    vector of zero-terminated arrays of char containing command line args
* returns:
* no return value. creates a tree of nodes, outputs vector of node names,
* “highest” level first, in ascending order of child vector contents..
* Mar 27, 2011  Bill Abbott
*/

int main( int argc, char* argv[] ) {

/* 3 level b tree:
* M has two children, N and P, and
*    N has two children, Q and S.
*        Q has no children
*        S has no children
*    P has one child – T.
*        T has no children
*/

node* T = new node;
T->children.clear();
T->name = “T”;

node* S = new node;
S->children.clear();
S->name = “S”;

node* Q = new node;
Q->children.clear();
Q->name = “Q”;

node* P = new node;
P->children.push_back(T);
P->name = “P”;

node* N = new node;
N->children.push_back(Q);
N->children.push_back(S);
N->name = “N”;

node* M = new node;
M->children.push_back(N);
M->children.push_back(P);
M->name = “M”;

passThrough( M );

return( 1 );

————- Example output, with my debug couts—————-

Macintosh-6:interview Bill4\$ c++ recursion_v.cpp
Macintosh-6:interview Bill4\$ a.out
0  M  1  M    M  0  M  0  1  2  1
1  N  1  N    N  1  N  1  2  2  1
2  Q  1  Q    Q  2  Q  2  3  0  0
2  S  1  S    S  2  S  2  4  0  0
1  P  1  P    P  1  P  1  5  1  1
2  T  1  T    T  2  T  2  6  0  0
6
0  0
0  1
1  2
2  3
2  4
2  5
Q S T N P M
Macintosh-6:interview Bill4\$

————————– Here’s how this started:——————–

Ok- it was whiteboard code and didn’t compile. I’m working on that. Staring with std::vector blahblah  which seems like some really kewl Java/C++ library good stuff. What Kernighan and Ritchie were hoping for.

input:
(M)
|     \
(N) (P)
|     \    \
(Q) (S) (T)
(3 level b tree, M has two children, N and P, and N has two children, Q and S.  P has one child – T.

output:
Q, S, T, N, P, M

typedef struct node {
std::vector <node*> children;  std::string name;
}

typedef nameNLevel {
std::string name;   std::int level;
}

void passThrough( node* n ) {
std::Array namesNLevels;
int level = 0;

recur( n, level, namesNLevels );

int maxLevel = 0;

for (int i=0; i
if (namesNLevels[i].leve l> maxLevel) {
maxLevel = namesNLevels[i].level;
} /* if */
} /* for */

/* note: recursion isn’t getting the data in the order we want… */
/* so, order them before printing. */

for (level=0; level =<maxLevel; level++){
for (i=0; i<
if (atoi(namesNLevels[i].name) == level){  printf( “%s “, namesNLevels[i].name } /* longer but more robust */
} /* for i */
} /* for level */

printf( “\n”);

}

void recur( node* n, integer level, std::Array nsNLs ){
/* its an array, gets passes as a pointer anyhow */
nsNLs.pushback( new nameNLevel( n->name, level));

if ( 0 != n->children.count()) {   /* there are children */
for (int childCount =0; childCount < ( n->children.count()); childCount++ ){
recur(  n->children[childCount], level+1, nsNLs );
/* recursive call- children & next level… */
} /* for int childCount… */
} /* if ( 0!=  */

}

A nice page at IBM regarding recursion:

http://www.ibm.com/developerworks/linux/library/l-recurs.html#

Good stuff!

## Cool interview questions

Q – What 3 Unix utilities would you take to a desert island, and why?

A: Find, Grep, hmm, gnuplot. Find and Grep are very capable, deep, and reliable. It would be easier not to have to re-do them myself, whereas sort, ls, df, even ps aren’t nearly as difficult to re-write myself.

Q: You have two strings, needle and haystack. Both are very long- haystack is at least as long as needle, could be longer. Potentially, millions of characters.

Q1: What’s the fastest way to determine whether all the characters in needle are found in haystack? Order doesn’t have to be the same, but if there are 27 ‘A’ characters in needle, we want to know whether there are at least 27 in haystack. And on through the rest of the, say, 8 bit character set.

A: Obviously, the trick here is to NOT do an N * N solution where each character in needle results in a read (or even a partial read) of haystack. Yet, one can’t avoid reading needle at least once. So: make an array with 256 elements, coresponding to character values 0 to 255. Initialize all array members to 0. Read through needle, incrementing the count for each character value when that character is encountered. For example: needle is “needle”. Character counts are:

d: 1
e: 3
l: 1
n: 1

Once needle has been completely read and counted, start reading haystack, and *decrement* the count (but stop at 0)  for each character value whenever one of those characters appears. The read through haystack ends when either all the character counters are “0”, which is success, or when haystack is exhausted AND at least one character counter is greater than zero.

This solution is on the order of 2N + some constant, much better than the N * N obvious technique.

Q2: The A1 technique requires 256 counters of 20+ bits – most likely 32 bit integers. So it requires 1K bytes. How can the job be done with the very least amount of memory, but as long a run time as it takes?

A2: 2, 32 bit integers, and a character, are enough. One integer is an index (indice) into needle or haystack, one is a counter. The character is incremented through all possible character values. For each character value, the count is zeroed, then needle is stepped through, adding one to the counter for every instance of the target character code. After needle is completely counted, the same index is used to step through the locations in haystack, and the integer in the counter is decremented (provided it started off positive) any time the target character is found. But not below 0. If any trial gets to the end of haystack with a non-zero counter value, return a fail and give up. If the counter gets to zero while decrementing, look no further, increment the character, start on the next character.

## How to set up work space (directories) for software development:

How to lay out a work space for software development:

Q: Why not use
/Users/yourNameHere/blah.blah – everything in the login directory?

A:Its messy, impractical hard to export.

Requirements for a software development workspace:

1. Makes it easiest to do things the ‘right’ (you choose ‘right’) way.
2. Source files and support stuff can easily be version-controlled: using a third party tool (Perforce, etc) or informal methods (numbered revisions in a “History” or “Rathole” or per-file subdirectory below the working directory, etc.
3. Environment neutral: Easily exported to someone else’s environment (Some people don’t value this- I like to send out examples that arrive workable and sometimes I ask others for help. If you don’t need that, by all means have a big alias file and a bunch of stuff in /bash_profile… But do consider packaging all of that stuff so you, yourself, can move from system to system, OS to OS…)
4. Scalability: One file, 100 files, 100,000 files (i.e, no bottom)

I’ve seen a bunch of somewhat thoughtlessly assembled setups at a variety of work environments. Back in the mists of time, at Lomac, in 1979, we didn’t have password protection on our 8 inch floppy environments (RT-11 for PDP-11/03, and CP/M for S-100 Z-80s). Easy stuff. Boot up and, on a build disk, there it all was. (WERE there subdirectories in RT-11 and CP/M?, Google knows…)

At Perforce, the Performance Lab used a nice organization of:

/Users/yourNameHere/- anything and everything, but no production stuff here.

Installed stuff, supported by two additions to PATH

export PATH=\$PATH:/Release/namedRelease_*:/Work/yourNameHere/namedRelease_*:.

This then became:
/Releases/namedRelease_1/usualTopLevel
/Releases/namedRelease_1/usualTopLevel/bin
/Releases/namedRelease_1/usualTopLevel/etc
/Releases/namedRelease_1/bin
/Releases/namedRelease_2/usualTopLevel
/Releases/namedRelease_2/usualTopLevel/bin
/Releases/namedRelease_2/usualTopLevel/etc
/Releases/namedRelease_2/bin

Database location:
/db/namedRelease_1/
/db/namedRelease_2/

Leading with a named release allows Path to mix and match generically named 3rd party tools. Like Source Control Management…

Project work:
/Work/yourNameHere/namedRelease_1/
/Work/yourNameHere/namedRelease_2/

and if you absolutely, positively, have to work in /Users…
/Users/yourNameHere/namedRelease_1a
/Users/yourNameHere/namedRelease_3b

What NOT to do:

The “namedRelease_*” paths looks unnecessary, up to the time you need to maintain two parallel developments. You can keep one in /Releases/usualTopLevel/… and one in /Users/yourNameHere/usualTopLevel but its all too easy to get to:
/Releases/usualTopLevel/…
/Releases_1/usualTopLevel/…
/Releases/usualTopLevel_2/…
/Users/yourNameHere/usualTopLevel/…
/Users/yourNameHere/usualTopLevel_A/…
/Users/yourNameHere_2/usualTopLevel/…

and having to hand edit these irregular paths here, there and everywhere. Ask me how I know. 🙂

And what I plan to do here at home:

/Work/Bill4/rel_1/
to start with. Its pretty light weight. just took a few minutes to set up.