Saturday, December 05, 2009
Google App Engine
Python confusion
- lists: L = ['a', 'b', 'this is another element', 1, [1, 2, 3]]. Can do L.append(x), len(L), L.pop(), etc.
- tuples: T = 1, 2, 3, 4, 'this is a tuple element'. Or T1 = () for an empty tuple, T2 = 'one element tuple', (the trailing comma is what makes it a tuple). Tuples are immutable, so the mutating functions above (append, pop) do not apply, though len(T) still works.
- sets: S = {1, 2, 3, 'set element'}. The items must be unique (and hashable) and set functions are available. S = set(L) converts a list to a set, removing duplicates in the process - though it would fail on the L above, whose nested list is unhashable. Note that the {...} literal syntax needs Python 2.7/3.x; on older versions use set([...]). A short sketch of all three follows.
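A minimal sketch of all three (set-literal syntax assumes Python 2.7+; the values are arbitrary):

L = ['a', 'b', 'this is another element', 1, [1, 2, 3]]
L.append('new')                  # lists are mutable
print(len(L))                    # 6
print(L.pop())                   # 'new'

T = (1, 2, 3, 4, 'this is a tuple element')
T1 = ()                          # empty tuple
T2 = ('one element tuple',)      # the trailing comma makes it a tuple
print(len(T))                    # len() works; T.append(...) would raise AttributeError

S = {1, 2, 2, 3, 'set element'}  # duplicates collapse; elements must be hashable
S.add(4)
print(S)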
Sunday, November 08, 2009
Quick note on ORM
Saturday, October 31, 2009
Cloud, AIR, GoogleHealth
Monday, October 26, 2009
Wither SQL?
Friday, October 23, 2009
Wednesday, October 14, 2009
Friday, October 02, 2009
More on ORM; optimization?
Then the question becomes: where is this data saved? Perhaps in some raw extensions of the sparse arrays that hold the object member data.
Another interesting aspect (related to the sparse array storage system) is the kind of optimization, if any, that occurs at the SQL relational engine level. If there is optimization of any kind done at the I/O-sparse array level, this might conflict with the SQL optimization. Interesting stuff.
Which raises the question: is the optimization cottage industry a by-product of the relational model? I have always found Oracle's optimization 'strategies' (the thick books dealing with them) somewhat ludicrous and antiquated. To do that really well, you need a deep understanding both of the data and of sorting algorithms; with so many intervening layers (physical design, I/O patterns), even that understanding is corrupted. So if you can avoid a couple of grievous errors (e.g. multiple joins on non-indexed columns), you will do reasonably well. But then, the DBMS should be able to detect whether you're about to make a grievous error (or perhaps the reporting tool should, if you use one). So why a thick book on optimization?
AIR and GoogleHealth
The code requires a (sqlite) database, and of course the HTML forms. However, the most important functionality is encapsulated in the file, so that should be enough for a quick start.
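For the record, opening (or creating) the SQLite database from the AIR/JavaScript side looks roughly like this - a sketch only; the file name gh.db and the table are made up, the real ones live in the project file:

// assumes AIRAliases.js has been loaded, so the air.* aliases exist
var conn = new air.SQLConnection();
var dbFile = air.File.applicationStorageDirectory.resolvePath("gh.db");
conn.open(dbFile);   // synchronous open; creates the file if it does not exist

var stmt = new air.SQLStatement();
stmt.sqlConnection = conn;
stmt.text = "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, payload TEXT)";
stmt.execute();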
It's all, of course, ensconced in Subversion....
Wednesday, September 30, 2009
Subversion
I have not used source control systems much, and I am finding that setting one up on a Windows machine with open-source IDEs (specifically, not Eclipse) is more painful than it should be - the documentation somehow assumes you're using either Eclipse or a Unix system, or both. Here is what seems to work for me (a consolidated command sketch follows the list):
- install Subversion
- create (in DOS) a directory where you will store the files: dirX
- in DOS: svnadmin create dirX (e.g.: D:\svn)
- in DOS: set EDITOR=notepad.exe
- in DOS, D:\>svn mkdir file:///svn/python (if python is the subdirectory where you want to store a project); using a \ (e.g. svn\python) will cause svn to fail with a weird assertion
- do the initial load of the project in the subversion system: svn import D:\pythonsource\ file:///svn/python (assuming your project is in D:\pythonsource)
- you will get a log-message window in Notepad - close it, and choose [c] in DOS to continue the process of loading the directory into Subversion
- at this point you will have the original source and the Subversion copy; when the IDE checks out from Subversion it will create another project, so you can delete the initial source directory
- you might want to include only the source files in the initial load... and create the project to include everything; be careful here if you need additional libraries (e.g. developing Processing projects in the NetBeans IDE, which will need the additional core.jar added to the libraries)
- set up the IDE's:
- NetBeans:
- use the Team > Checkout menu option
- use the URL as below (Aptana)
- you will be asked to create a new project to which the files will be downloaded
- if you do, be careful not to create a new Main class (assuming you have a Java project)
- so ideally the workflow is
- create the initial project in the IDE
- only keep the SRC directory
- create the SVN structure as above
- create the new project in the IDE based on a SVN checkout
- Aptana:
- open the SVN view
- create new Repository Location (right click in the SVN window)
- the URL will be file:///d:/svn/python
- then back to the SVN view to check out the project into an active project (right click on the repository)
- you will manipulate the files through the Team context menu (right click the file in the solution explorer) in the main Aptana view (not Pydev, if you are using it for Python files) - update the file, update the directory, then check it in
- if you import it into a new project, e.g. AIR, you will be able to specify all the parameters again; so if you have some existing project parameters (e.g. the startup form), you will need to manually make the necessary adjustments (for AIR, change the application manifest, application.xml; you will also need to reimport the AIRAliases.js file)
- at this point the code is checked out and available to use; remember to update/commit it to the repository
- with AIR specifically, you shouldn't commit the icons to the repository (nor other generated files such as the .project file)
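For reference, the repository-side commands from the list above, consolidated (same example paths; a sketch, not a literal transcript):

rem one-time repository setup
svnadmin create D:\svn
set EDITOR=notepad.exe
rem note the forward slashes in the URL - a backslash makes svn fail with an assertion
svn mkdir file:///svn/python
rem initial import of the existing source tree
svn import D:\pythonsource\ file:///svn/python
rem optional sanity check
svn list file:///svn/python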
Alternatively (at least in NetBeans), once you have created the first SVN connection, you can check in a project without going through svn import. Just write the source, then right-click on it and choose Subversion > Commit to send it to the repository. You can still look at the history of changes between different versions - not sure how well this works in an environment with multiple users, though, since the original codebase is your own.
More details here. Notice that having Subversion running will show the hard drive where you have the repositories with a different icon in Windows Explorer.
Monday, September 28, 2009
Oracle and objects
Create a custom type - which, besides data attributes, can include member functions (defined in two parts: the specification, holding the attribute and function declarations, and the body, containing the function definitions).
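A sketch of such a two-part definition (the attribute names are assumed here, chosen to match the insert and the method calls below):

CREATE OR REPLACE TYPE person_typ AS OBJECT (
  idno        NUMBER,
  first_name  VARCHAR2(20),
  last_name   VARCHAR2(25),
  email       VARCHAR2(30),
  phone       VARCHAR2(25),
  MEMBER FUNCTION get_idno RETURN NUMBER,
  MEMBER PROCEDURE display_details ( SELF IN OUT NOCOPY person_typ )
);
/
CREATE OR REPLACE TYPE BODY person_typ AS
  MEMBER FUNCTION get_idno RETURN NUMBER IS
  BEGIN
    RETURN idno;
  END;
  MEMBER PROCEDURE display_details ( SELF IN OUT NOCOPY person_typ ) IS
  BEGIN
    DBMS_OUTPUT.PUT_LINE(idno || ' - ' || first_name || ' ' || last_name);
    DBMS_OUTPUT.PUT_LINE(email || ' - ' || phone);
  END;
END;
/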
Create the table:
CREATE TABLE object_table ( ..., pobject person_typ )
Inserting the data is done this way:
INSERT INTO object_table VALUES ( 'second insert',
person_typ (51, 'donald', 'duck', 'dduck@disney.com', '66-650-555-0125'));
Notice the implicit constructor.
To call a method:
SELECT o.pobject.get_idno() from object_table o
This is cool. But usually objects are used in code. So how is the client code / database object chasm bridged over?
Objects can also be stored alone, without relational data (row objects, as opposed to column objects as in the example above).
CREATE TABLE person_obj_table OF person_typ;
Scanning the object table:
DECLARE person person_typ;
BEGIN
SELECT VALUE(p) INTO person FROM person_obj_table p WHERE p.idno = 101;
person.display_details();
END;
/
Pointers to objects are supported via the REF type.
You can use a SELECT INTO to load a specific row object into an object variable.
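A small sketch of a REF in use (the contacts table is made up for illustration):

-- a column holding pointers to row objects in person_obj_table
CREATE TABLE contacts (
  contact       REF person_typ SCOPE IS person_obj_table,
  contact_date  DATE
);

DECLARE
  pref   REF person_typ;
  person person_typ;
BEGIN
  SELECT REF(p) INTO pref FROM person_obj_table p WHERE p.idno = 101;
  -- follow the pointer back to the object
  SELECT DEREF(pref) INTO person FROM dual;
  DBMS_OUTPUT.PUT_LINE(person.get_idno());
END;
/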
You can implement database functions, procedures, or member methods of an object
type in PL/SQL, Java, C, or .NET as external procedures. This is a way to have the objects execute code defined externally. Only PL/SQL and Java code is stored in the database.
As far as consuming objects externally, one way is by using untyped structures; another is by using a wizard to create strongly typed (Java) classes.
Object views, where you define a filter that interprets the rows in a table as an object, are an interesting innovation.
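A sketch of one, assuming a plain relational table named persons with matching columns:

CREATE TABLE persons (
  idno        NUMBER PRIMARY KEY,
  first_name  VARCHAR2(20),
  last_name   VARCHAR2(25),
  email       VARCHAR2(30),
  phone       VARCHAR2(25)
);

CREATE OR REPLACE VIEW persons_ov OF person_typ
  WITH OBJECT IDENTIFIER (idno) AS
  SELECT p.idno, p.first_name, p.last_name, p.email, p.phone
    FROM persons p;

-- the relational rows now behave like person_typ objects
SELECT o.get_idno() FROM persons_ov o;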
So does this really solve the impedance problem? It's not like you define an object in C# then persist it in the database, then deserialize it in the application again and call its methods. It's more like, you define an object in the database, and with some manual work you can map between it and a custom class you define in Java. You can define some of its methods in C# (using the Oracle Database Extensions for .NET) - how is that for multiple indirections?
The question is really, where do you want your code to execute. In the case discussed above (defining member functions in .NET), Oracle acts as a CLR host for the .NET runtime; not unlike the way SQL Server external procedures (written in C and compiled as DLLs) used to run in an external process space. So the code executes outside the (physical) database process, but still inside a (logical) database layer. I still can't escape a nagging feeling that this is as database-centric a view of the application as they come. Usually the design of an application starts with actor modeling, etc., and the data layer is something that does not come into play until the end. Ideally, from an application designer's perspective, as I mentioned above, you should be able to just persist an object somehow to the database, and instantiate/deserialize it from the data layer/the abstract persistence without too much fuss. In the case of Cache this is made easier by the fact that the application layer coexists with the database layer and has access to the native objects (at least if you use the Cache application development environment).
In the case of Oracle, the separate spaces - database for storage/execution and application for execution - pose the standard impedance mismatch problem, which I am not sure is in any way eased by the OO features of the database.
An ideal solution? Maybe database functionality should be provided by the OS layer and the application development/execution environment should be able to take advantage of that.
Meanwhile, Microsoft's Entity Framework (actually, a rather logical development from ADO.NET) deals with this problem in the dev environment. What I have seen so far looks cool, just a couple of questions:
- can you start with the entities and generate (forward engineer) the database tables?
- how is the schema versioned, and how are evolutionary changes synced?
- how does the (obvious) overhead perform when there are hundreds of tables, mappings, etc.?
Incidentally, using the Oracle ODP.NET driver in Visual Studio yields a much better experience with an Oracle database than using the standard MS drivers. You actually get results back (XML-formatted) when querying object tables (the MS driver reports them as an 'unsupported data type') and you can interact with the underlying database much more, including the tuning advisor, deeper database object introspection, etc.
Even PostgreSQL (which I find quite cool, actually) portrays itself as having object/relational features - table structures can be inherited.
Saturday, September 26, 2009
More on globals and classes in Caché
Class Definition: TransactionData
/// Test class - Julian, Sept 2009
Class User.TransactionData Extends %Persistent
{
Property Message As %String;
Property Token As %Integer;
}
Routine: test.mac
Set ^tdp = ##class(User.TransactionData).%New()
Set ^tdp.Message = "XXXX^QPR^JTX"
Set ^tdp.Token = 131
Write !, "Created: " _ ^tdp
Terminal:
USER> do ^test
... Created 1@User.TransactionData
Studio: Globals
tdp
^tdp = "1@User.TransactionData"
tdp.Message
^tdp.Message = "XXXX^QPR^JTX"
tdp.Token
^tdp.Token = 131
The order of creation is:
- create the class
- this will create the SQL objects
- populating the SQL table will instantiate the globals
- the globals are: classD for data, classI for index (here, ^User.TransactionDataD and ^User.TransactionDataI)
Objects can be created (%New) or opened (%OpenId) from code, but to be saved (%Save, which will update the database), the restrictions must be met (required properties, unique indexes, etc.).
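A minimal sketch against the TransactionData class above (the values are arbitrary):

 // create, populate, and save an instance
 Set obj = ##class(User.TransactionData).%New()
 Set obj.Message = "test message"
 Set obj.Token = 42
 // %Save returns a %Status; check it before assuming the row exists
 Set sc = obj.%Save()
 If $System.Status.IsError(sc) Do $System.Status.DisplayError(sc) Quit
 Write !, "Saved with ID ", obj.%Id()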
Also, I finally got the .NET gateway generator to work: it creates native .NET classes that can communicate with Cache objects. Here is a sample of the client code:
InterSystems.Data.CacheClient.CacheConnection cn = new InterSystems.Data.CacheClient.CacheConnection("Server=Irikiki; Port=1972;" +
"Log File = D:\\CacheNet\\DotNetCurrentAccess.log; Namespace = USER;" +
"Password = ______; USER ID = ____");
cn.Open();
PatientInfo pi = new PatientInfo(cn);
pi.PatientName = "New Patient";
pi.PatientID = new byte[1]{6};
InterSystems.Data.CacheTypes.CacheStatus x = pi.Save();
Console.WriteLine(x.Message);
PatientInfo is a class defined in Cache, as follows:
Class User.PatientInfo Extends %Persistent
{
Property PatientName As %String [ Required ];
Property PatientDOB As %Date;
Property PatientID As %ObjectIdentity;
Method getVersion() As %String
{
Quit "Version 1.0"
}
Index IndexPatientName On PatientName;
Index IndexPatientId On PatientID [ IdKey, PrimaryKey, Unique ];
}
Easy enough: the getVersion() method is available to the C# code, as are the persistence and all the other methods natively available in ObjectScript. The generated code is here.
Wednesday, September 23, 2009
Ahead of the curve?
- learning the Google Data API
- learning the Google Health API which rests on top of the Data API
- (re) figuring out some of AIR's limitations and features
- (re) figuring out some of JavaScript's limitations and features
- using the mixed AIR/JavaScript environment
In my experience this is pretty standard when dealing with new languages and platforms. 15 years on, still a struggle - but then probably one should be worried when one becomes too proficient in a language/platform because it's already obsolete by then.
Tuesday, September 22, 2009
Caché and ODBC
Monday, September 21, 2009
Sybase joins the healthcare fray
Their flagship product in the industry seems to be eBiz Impact, YAIP (yet another integration platform) in the vein of Ensemble, DBMotion, and perhaps even alert-online. I might have to revise my chart from a few posts ago.
Saturday, September 12, 2009
XMLite?
Thank you for the improved version
Ok enough ranting. Will be documenting the GH project next... update to follow.
Tuesday, September 08, 2009
Follow-ups
Tuesday, August 04, 2009
(very) Preliminary performance comparisons
Friday, July 31, 2009
Competition in high technology
I think this starts with the fact that the "IT industry" is in fact a multiple-headed beast, since so many other industries use it. So defining the industry in which these players compete is difficult in itself.

So basically some companies started in an industry vertical (Intersystems - healthcare) where they built a complete stack, which they then exported to other verticals (finance, for Intersystems) or to the "center", becoming integrated players (Cache portrays itself as a general-purpose "post-relational" database; my guess is that this moniker is an attempt to rebrand it as a mainstream competitor, reframing the "hierarchical database with roots in 60's healthcare software" description; BTW, there is nothing wrong with this description, and I think it is a cool product with remarkable performance characteristics).
The challenge in this case is convincing a mainstream audience that a niche product originating in a vertical is indeed a viable proposition. Tough, especially since the ecosystem (e.g., reporting tools) is built around standards that Cache, for example, works around (e.g., the SQL "pointers").
Secondly, there are the pure vertical players (which I haven't really talked much about here, such as Siemens and GE). They built their application portfolios by acquisition (so perhaps dbMotion is a potential acquisition target?) but they rely on the mainstream vendors from the "center" for the base technology (perhaps; e.g., Siemens uses MS SQL as the db engine for its HIS, but Epic uses Cache).
Then, there are the mainstream technology companies which are trying to move from the center (pure database platform) into verticals (Amalga). At this point they are obviously encroaching on the vertical vendors' territory, be they pure vertical players or integrated players. How companies will compete in one segment while collaborating in others remains to be seen (competition on the vertical industry offering, collaboration at the platform level).
Fourth, there are niche players (dbMotion, SQLite, FairCom) which operate either in the vertical or in the center, but offer solutions appropriate for a specific vertical (e.g. FairCom having found a reasonably comfortable place as the engine of choice for turnkey systems). As mentioned already, I would guess that at some point dbMotion's PE backers (if there are any) will be looking for an exit in the guise of a purchase by a vendor, either in the mainstream/center (more likely) or in the vertical, while SQLite or FairCom are likely, due to their more general (albeit niche) appeal, to survive on their own.
There are plenty of interesting companies I have not covered such as MCObject, db4o, Pervasive, OpenLINK, NeoTool, and perhaps even Progress. As time permits I might revisit this writeup to include them, and perhaps even do a nice boxes and arrows schema, as good strategy analysis always seems to demand!