pink cylinders

More than Four

...there's an axis for that.

Installing VirtualBox 4.3 on Windows 7 64-bit - Solved

I was preparing for an Oracle online workshop on Database 12c multi-tenancy, and as part of the prep, you get to download a VM image with the lab environment. Of course, being Oracle, it's a VirtualBox VM.

Now, I like VirtualBox. It works great on my Mac at home, but for some reason I've had problems installing 4.x on my work laptop (a Lenovo W520, getting on a bit now but still recommended).

The problem is that VirtualBox almost completely installs - until it gets to a certain point, then the progress bar runs backwards and it says "rolling back install", followed by:

"Oracle VM VirtualBox 4.3.20 Setup Wizard ended prematurely" because of "an error". This is followed up by "Installation failed! Fatal error during installation.".

No other information is provided. Running as Administrator after a cold reboot didn't help.

Well... I really wanted to have the workshop lab environment available, so even though I'd found no useful help on the web the last time I encountered this problem, I tried again... and this time, I found something on the forums, in a post by user Tronmech.

I'll repeat the instructions here in case for some reason that post vanishes:

  • Go into the registry at: HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Network
  • Change "MaxNumFilters" from 8 to 20 (decimal)
  • Reboot
  • Install 4.3.x (run installer as administrator, just in case)
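In case that post ever does vanish, the same registry tweak can also be captured as a .reg file you can import via double-click. This is an untested sketch of the change described above, not from Tronmech's post; note that dword 0x14 is 20 in decimal:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Network]
"MaxNumFilters"=dword:00000014
```

A reboot is still required after importing it.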

This time, the install was successful. Thanks, Tronmech!

After some more research, it seems that Windows 7 has a hard-coded maximum of 14 network filter drivers... and for some reason, this registry setting reduces the limit further, to 8 by default.

So setting the MaxNumFilters key to 20 probably just lifts the artificial limit, allowing the full 14 possible filters. Applications such as virtual machine managers and VPN clients need to add filters to the network stack, and raising this registry value seems to be the recommended fix.

In one KB article, I read that a Windows Service Pack could not be installed until some filters were removed!

To view currently deployed Network Filter Drivers, right-click on your connection widget in the Network Connections control panel applet, and view properties:

I believe that each of those checkboxed items is a "Network Filter Driver". I have 9 of them, and I assume that the "VirtualBox Bridged Networking" driver was the 9th one added after I lifted the limit of 8.

Fuzzy Fonts in OS X web browsers

My better half and I recently upgraded our personal computers from aging Sony Vaio laptops to twin 2013 iMacs. We'd wanted to dip our toes in the OS X waters for some time, and this seemed like a good opportunity.

I'd always supposed Apple Macs to be a paragon of typographic perfection, so imagine my surprise when I discovered the great "fuzzy font" controversy surrounding recent OS X releases.

A lot has been written about this elsewhere, so I won't repeat it.

Anyway, the problem seems to be particularly noticeable with light text on a dark background. Unfortunately, this directly affects my Prodigal Sounds family of web sites.

Thanks to a post by Tim Van Damme, I found a workaround: add a block to the web site's CSS stylesheet:

html {
  -webkit-font-smoothing: antialiased;
}

This seems to force the browser (at least Safari and Chrome) to render as though the LCD font smoothing option were turned off, making the text more readable for those browsing on Macs, even if they have LCD font smoothing enabled.

If you're on a Mac, you can see the results here at

Implementing a Proper() case function in PL/SQL

I found that I needed to convert some raw column names (e.g. THIS_IS_MY_COLUMN) into a human-readable "Proper" or "Title" case form: This Is My Column.

Scouring the web, I didn't find anything I could steal, so here's my implementation:

   function Proper(
      p_string varchar2
   ) return varchar2
   is
      l_string varchar2(200);
      l_proper varchar2(200);
   begin
      l_proper := '';
      -- change any underscores to spaces:
      l_string := replace( lower(p_string), '_', ' ' );

      for i in 1..length(l_string)
      loop
         -- obviously the first character is UPPER:
         if i = 1 then
            l_proper := l_proper || upper( substr( l_string, i, 1 ) );
         -- if the character is preceded by a space, UPPER it:
         elsif substr( l_string, i-1, 1 ) = ' ' then
            l_proper := l_proper || upper( substr( l_string, i, 1 ) );
         else
            l_proper := l_proper || substr( l_string, i, 1 );
         end if;
      end loop;

      return l_proper;
   end Proper;
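If you want to sanity-check the transformation outside the database, here's the same logic sketched in Python. This is a hypothetical cross-check, not part of the PL/SQL implementation:

```python
def proper(s: str) -> str:
    """Convert RAW_COLUMN_NAME style strings to 'Raw Column Name' form."""
    # underscores become spaces, everything lower-cased first
    lowered = s.lower().replace("_", " ")
    out = []
    for i, ch in enumerate(lowered):
        # upper-case the first character, and any character after a space
        if i == 0 or lowered[i - 1] == " ":
            out.append(ch.upper())
        else:
            out.append(ch)
    return "".join(out)

print(proper("THIS_IS_MY_COLUMN"))  # This Is My Column
```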

From the "Look at me, I'm an idiot" department

If this were April 1st, I'd be sure this was satire, but as it is, I'm not sure:

Java SE - Change in Version Numbering scheme in March 2013

In case this vanishes off the web, here's the gist:

Seriously? I don't know who is to blame for this loss of sanity. Oracle, you are not so damn important that you need to re-invent this. My understanding is that there is a well-established convention for numbering incremental versions of a product, that allows for such concepts as minor feature releases and bug fix releases.

The scheme is nicely described here on Stack Overflow: How to do version numbers.

What is sad is that back in the Sun days, Java releases did follow this numbering convention.

Exploring AUTHID with an eye to partitioning client data


Is it possible to design a multi-tenant architecture in an Oracle DB instance, such that different customers' data is transparently partitioned whilst keeping common code and common tables in a single location?

Possible Strategy

Consider the following architecture:


Here we have two schemas in an Oracle database instance:

  • U_Common contains common tables and stored procedures, packages, etc.
  • U_Cust_1 contains customer-specific tables, with private data. 
  • Client processes acting on behalf of Customer_1 are connecting to the U_Cust_1 schema. 

(You may assume we also have U_Cust_2, etc.)

Assume that all access is performed through stored procedures or similar code objects.
In order for sessions connecting to U_Cust_1 to execute specific procedures in U_Common, they need:

  • EXECUTE permissions on the objects in U_Common;
  • Either:
    o    explicit namespace prefixes, e.g. U_Common.DoSomething()
    o    a synonym object referencing the code object in U_Common 

In order to have common code objects that reference both common tables and customer tables, we need to create "template" versions of the tables in the U_Common schema:


However, when U_Cust_1 invokes the procedure in U_Common, the session can't "see" the customer-specific version of the table, because the code is executing in the context of U_Common. It will see the template table data (or lack of it - presumably there are no rows in the template tables).

This can be addressed by adjusting the AUTHID clause when the code objects are defined:


  create or replace package U_Common [authid definer] is ..

Change to:

  create or replace package U_Common authid current_user is ..

According to the documentation, the AUTHID clause tells Oracle whether to run the code with the invoker's rights or the definer's (owner's) rights. The default is DEFINER, i.e. the owner.


In practice, this means that when U_Cust_1 executes the code object defined as AUTHID CURRENT_USER,  the customer tables in U_Cust_1 are in scope, and are accessed instead of the template tables in the U_Common schema.

Three interesting observations about executing code defined as AUTHID CURRENT_USER:

  • the code does NOT see the Common tables. Any table in U_Common is "out of scope";
  • the code can see other procedures and packages in U_Common, including those defined as AUTHID DEFINER (default);
  • when the code makes a nested call to those other procedures, the tables in U_Common are now visible!

 Here's a sequence diagram of the call stack:


This means that everything will work just fine, so long as the code objects are classified into "programs that access common data" and "programs that access customer data".

Concerning Views

Views don’t work so well. 

You can define a view in U_Common that joins a Common table with a Customer table, but any U_Common procedure that is authid current_user will not be able to "see" the view from the context of U_Cust_1.

You can try this:

grant select on My_View to U_Cust_1;           -- run as U_Common (My_View is a placeholder name)
create synonym My_View for U_Common.My_View;   -- run as U_Cust_1

But the result set will be entirely pulled from the U_Common schema tables.

So this technique probably isn't going to be used to create a transparently customer-partitioned application architecture. But it is still worth understanding how AUTHID works.

A nice enhancement in PL/SQL Developer .NEXT

I've been taking a look at the public beta of the next version of PL/SQL Developer. I'm currently using the 8.0 release, choosing not to upgrade to 9.0 when it was released last year. One of the new features Allround Automations added in 9.0 was the ability for one instance of the application to open sessions to multiple databases. I considered this a pretty risky enhancement.

However, in the upcoming 10.0 release, this multi-session feature seems to be refined and more solid. One nice enhancement is the ability to assign a window background color to specific connections.

Here, I am connecting to the QA, UAT, and PROD environments: 

Now there is much less chance of accidentally running a data update on the wrong DB! Frankly, this feature would have been quite useful in 8.0 and earlier. It's nice to have an obvious reminder of which environment you're looking at.

Data Prospecting

Lisa and I drove into the city last Wednesday to attend the San Francisco SQL Server User Group meeting, to see a presentation of Microsoft's latest Data Mining tools. It was a great presentation, very enlightening.

I confess, I've always been under a misconception about data mining. For reasons I'll explain below, I've had a bias that "data mining" was all about exploring your data and looking for unexpected trends and relationships that perhaps could then be leveraged.

However, at the user group presentation, as William Brown took us through a data mining exercise, several things became clear to me:

In Data Mining projects, the following axioms are important:

  • Know what you're looking for
  • Know how to recognize it when you've found it

William started with the following goal: "Of my database of customers who haven't purchased a bicycle, which of them are likely to buy one?" 

He then used the Microsoft toolset to create a "sieve" through which he could pour the customer database and build a collection of likely bicycle purchasers.

This was possible because he started with a table of customers known to be bicycle purchasers, and used a subset of it to effectively "train" the sieve to identify rows with similar properties. He also used the remainder of the known data set to test the sieve's effectiveness.
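The train-then-test idea can be sketched with a toy example. This is emphatically not the Microsoft toolset - just a hypothetical nearest-centroid "sieve" over made-up (age, income) features - but it shows the shape of the workflow: train on a subset of the known purchasers, then validate on the held-out remainder:

```python
def centroid(rows):
    """Average each feature across a list of equal-length tuples."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows) / n for i in range(len(rows[0])))

def train(buyers, non_buyers):
    """'Train the sieve': one centroid per class."""
    return centroid(buyers), centroid(non_buyers)

def predict(model, row):
    """Classify a row by which class centroid it sits closer to."""
    buy_c, no_c = model
    d_buy = sum((a - b) ** 2 for a, b in zip(row, buy_c))
    d_no = sum((a - b) ** 2 for a, b in zip(row, no_c))
    return d_buy < d_no  # True => likely bicycle purchaser

# Train on a subset of known purchasers / non-purchasers (made-up data)...
model = train(buyers=[(25, 60), (30, 75)], non_buyers=[(60, 40), (55, 30)])

# ...then test on held-out known rows to measure the sieve's effectiveness.
print(predict(model, (28, 70)))  # True  - resembles the known purchasers
print(predict(model, (58, 35)))  # False - resembles the known non-purchasers
```

The real toolset is far more sophisticated, but the principle is the same: the known data set both builds the model and proves whether it works.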

This is almost the exact opposite of what I had presumed "Data Mining" was all about, but I stand corrected: "Data Mining" is an excellent description of the process. 

You go into the project knowing what it is you want to find, and you have a test that can be used to prove if you are successful. 

Digression: Arthur Ransome's Pigeon Post

In the book Pigeon Post by Arthur Ransome, a team of child explorers decides to go prospecting for gold in the local hills, based on an old rumor of a lost gold seam. Adventures ensue, of course. It's a great book, with detailed descriptions of the prospecting, ore crushing, smelting, and charcoal burning processes. After an unsuccessful attempt to melt their gold dust into a nugget - it just oxidizes into a black lump - they realize that just finding gold-colored ore isn't enough: they need a test to prove the result to their uncle, Captain Flint, who has given up on fruitless treasure hunts:

"It probably wasn't gold at all," said Susan. "We ought to have made sure."

Dick looked up suddenly. "There is one way we could," he said. "We could try with aqua regia. You know, a chemical test. The book says gold dissolves in aqua regia. If it does, we'd know for sure."

More ore is extracted, and later on Dick performs the test, working in Captain Flint's workshop laboratory. 

Dick's tests show that the ore does indeed dissolve in aqua regia (a mixture of two strong acids), but his joy is short-lived when Captain Flint returns and reviews the results:

"What's this? Chemistry?" asked Captain Flint.

"Gold," said Dick.

"Gold? Don't you go and get interested in the wretched stuff. I've sworn off. Had quite enough wasting time... What have you got in that test tube?"

"Aqua regia," said Dick. "And gold dust. And it's gone all right. Dissolved! Gold in the aqua regia. I was just a bit afraid it might not be gold, after all."

"But, my dear chap," said Captain Flint. "Aqua regia will dissolve almost anything. The point about gold is that it won't dissolve in anything else."

Captain Flint examines the ore and, using his knowledge of the geology of the surrounding hills, comes up with a better test to determine what, exactly, the prospectors have found.

Captain Flint's test reveals the ore to be copper pyrites, which in its way, turns out to be quite as good a discovery as gold would have been, and the adventure is a success after all. 

There are several insights we can pull from this story:

  • Choosing an appropriate "test for success" is critical.
  • Domain knowledge is very important, and the earlier you apply it, the better off you'll be.

But here's another thing: The prospectors weren't looking for copper. They were looking for gold. And without some help, they weren't ever going to understand what they had actually found.

This last point makes me think that there is something missing in the current Data Mining toolset.

Reductio ad memoria 

Over a decade ago, I worked on a project with some truly kick-ass data visualization tools: Silicon Graphics' MineSet. The project involved showing how a company could mix its own customer data with additional data sourced from a third party, and throw it all into a 3D visualization, allowing variation along five or more axes simultaneously.

For example, in addition to the X, Y, and Z axes, we also had color, density, size of the "splat", and probably some others that I can't recall. The thrill of taking this 3D model and spinning it around various viewing angles to see correlations between data axes - in real time, in a web browser! - was, frankly, mind-blowing at the time.

Now, in my opinion the project's premise was flawed, because the nature of the third-party data didn't really allow for a 1:1 join with the primary data set - it was more of an exercise in psychohistory.

But the tool was brilliant. It even had an Internet Explorer ActiveX viewer control for browsers.  

One feature I would have liked to have seen was a "randomize axes" option, where you could cycle through the various available columns in your data set, mapping them to different aspects of the 3D view, and then spin the result around, looking for unexpected relationships.

Because it's the unexpected that is truly insightful. I want to see the copper pyrites - or whatever else is in there - without having to explicitly test for it.

I don't think this is Data Mining as it is understood today. But perhaps it might be called "Data Prospecting".


A quick web search indicates that, in October 2003, the SGI MineSet product was licensed for development and sale to Purple Insight, a Gloucester, UK-based company that now appears to be defunct. This is a shame, because these visualization tools were unsurpassed by anything I've seen to date.

Lisa tells me that some of the underlying algorithms found their way into IBM's Many Eyes project.

Note: By the way, all of Arthur Ransome's books about the young explorers are great, and I highly recommend them, for children of all ages.