Julian @ Thales

Saturday, March 22, 2008

Blogger API

Using the Blogger API: of course, from JavaScript you are restricted by the same-origin security policy. Google has a nice API that allows JS to work... still trying to figure out a way to use this feature. The irony is that the demo I link to reads these very postings.

More programmable web

Lately the 'programmable web' has become more and more of a reality. Amazon, Google, and Adobe among others have made big contributions. Here are some:

Amazon S3

REST/SOAP based storage

Amazon SQS

SOAP based queuing

Amazon SimpleDB (still in beta)

similar to GoogleBase?
attribute-based semistructured repository

Amazon EC2

on demand virtual servers accessible via web services

GoogleBase

shared repository that can be queried using a SQL-type of language

GoogleGears Database

browser extension allowing the use of a local SQLite database (SQLite is just taking off: Adobe AIR also uses it)

GoogleGears WorkerThread

browser extension allowing multithreading
the threads are created locally; they are real OS threads spawned within the browser process
the threads communicate via text messages
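
The message-passing model can be illustrated outside the browser. What follows is a minimal Ruby sketch of the idea, not Gears code: a worker thread and its parent exchange plain strings through queues and share no other state. The names inbox and outbox and the "stop" sentinel are illustrative assumptions.

```ruby
require 'thread'  # provides Queue; already loaded on modern Rubies

inbox  = Queue.new   # text messages sent to the worker
outbox = Queue.new   # text replies sent back to the parent

worker = Thread.new do
  # The worker only ever sees the strings it is sent.
  while (text = inbox.pop) != "stop"
    outbox << "echo: #{text.upcase}"
  end
end

%w[hello gears].each { |text| inbox << text }
inbox << "stop"
worker.join

replies = []
replies << outbox.pop until outbox.empty?
puts replies
```

Because only strings cross the thread boundary, neither side can corrupt the other's data structures.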

On one hand, Adobe and Google bring web functionality to the desktop (stepping out of the browser sandbox: the AIR environment; GoogleGears); on the other hand, Amazon and Yahoo offer services that extend typical desktop or client/server functionality to the web.

I do not think the day is too far away when you will be able to compile a program online (if Office is going the software-as-a-service way, why not Visual Studio?), deploy it to a virtual environment in the cloud, and run it there. Possibly the only thing missing is a set of libraries to abstract the communication protocols (SOAP, XML-RPC, Atom, etc.; still too many).

Friday, March 21, 2008

Google Gears demo

Had some fun with Google Gears. To run this you need Google Gears installed, and it will not work with Safari.

Google Gears demo

Ruby and SQLite3 in Windows

If you need to run Ruby with SQLite on Windows: because Windows has no equivalent of the Unix #! line for setting up the environment from the script, make sure that sqlite3.rb and sqlite3.dll are in the same directory as the Ruby script. Figuring this out caused me much grief.
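
As a possible alternative to copying the files next to the script, a script can extend its own search paths before the require. This is a minimal sketch that assumes the library files live in a hypothetical lib subdirectory beside the script. Whether the DLL lookup honors the modified PATH on a given Windows setup is an assumption that has not been tested here.

```ruby
# Hypothetical layout: <script dir>/lib holds sqlite3.rb and sqlite3.dll.
script_dir = File.expand_path(File.dirname(__FILE__))
lib_dir    = File.join(script_dir, "lib")

# Let `require` find .rb files in lib_dir.
$LOAD_PATH.unshift(lib_dir) unless $LOAD_PATH.include?(lib_dir)

# Prepend lib_dir to PATH, which Windows consults when locating DLLs.
ENV["PATH"] = [lib_dir, ENV["PATH"]].compact.join(File::PATH_SEPARATOR)

# require 'sqlite3'   # would follow here once the files are in place
```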

Thursday, March 13, 2008

Postgres and Ruby

On Mac OS X the Postgres binaries are installed in /usr/local/bin. Here is a handy Postgres launcher that can be saved in the user’s directory (as postgres_start.sh for example):

su -l postgres -c "/usr/local/bin/pg_ctl start -D /Users/postgres/datadir"

Where datadir is the directory (owned by the postgres user) where Postgres has the data files.

To create a custom tablespace in Postgres, make an empty directory first and use that as the location for the tablespace in pgAdmin3.

Sample Ruby code to access Postgres; notice that the first line is needed to set up the environment; without it the require fails.

#! /usr/bin/env ruby
#
# original file src/test/examples/testlibpq.c
# Modified by Razvan
# Calls a PL/pgSQL function in Postgres

require 'postgres'

def main
  norecs    = 0
  pghost    = "localhost"
  pgport    = 5432
  pgoptions = nil
  pgtty     = nil
  dbname    = "razvan"

  begin
    conn = PGconn.connect(pghost, pgport, pgoptions, pgtty, dbname)

    res = conn.exec("BEGIN")
    res.clear
    res = conn.exec("SELECT * FROM insertrt('another row')")

    if res.status != PGresult::TUPLES_OK
      raise PGError, "RB-Error executing command.\n"
    end

    printf("\nRB-Results\n")
    res.result.each do |tupl|
      tupl.each do |fld|
        printf("RB-%-15s", fld)
      end
      norecs += 1   # count rows, not individual fields
    end

    res = conn.exec("END")
    printf("\nRB-Records: %i\n", norecs)
    res.clear
    conn.close

  rescue PGError
    # conn is nil if the connect call itself failed
    if conn.nil? || conn.status == PGconn::CONNECTION_BAD
      STDERR.puts "RB-Connection lost."
    else
      # puts rather than printf: the message may contain '%' characters
      STDERR.puts "RB-Error: #{conn.error}"
    end
    exit(1)
  end # rescue
end # def main

main # invoke code

This calls the following Postgres plpgsql function:

CREATE OR REPLACE FUNCTION insertrt(data character varying)
  RETURNS bigint AS
$BODY$
DECLARE
  id bigint;
BEGIN
  id := 0;
  IF EXISTS(SELECT * FROM "RTable") THEN
    SELECT MAX("Id") INTO id FROM "RTable";
  END IF;
  id := id + 1;
  INSERT INTO "RTable" ("Data", "Id")
    VALUES(data, id);
  RAISE NOTICE 'New id is %', id;
  RETURN id;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;
ALTER FUNCTION insertrt(character varying) OWNER TO postgres;
GRANT EXECUTE ON FUNCTION insertrt(character varying) TO postgres;

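To execute this function, run SELECT * FROM insertrt('parameter'). Two things are interesting in the function: SELECT ... INTO assigns a query result to a variable, and double quotes (") enclose the mixed-case field and table names, which Postgres would otherwise fold to lowercase. The output of RAISE NOTICE is displayed in the console that runs the Ruby script.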

Since RubyGems does not work with OS X Tiger’s Ruby, the Postgres adapter has to be built manually.

Tuesday, May 22, 2007

Back again

It has been some time since I managed to post any entries. Currently I am busy learning Ruby; I am not yet sure how it relates to distributed systems, but since forever I have been looking for a language that would not be too syntactically twisted yet powerful enough, and this little scripting language might just be it. So watch out for some projects coming soon.

At the same time, I have been playing with OS X's Automator. While the choice of actions is limited (really, who would need to script iCal or GarageBand actions?), the possibilities offered by an all-pervasive workflow engine seem intriguing. I'm not crazy about AppleScript, but then there is a library that enables Ruby to perform AppleScript actions :D

Thursday, December 07, 2006

Roger Wolter lecture notes

A few ideas from an article by Microsoft's Roger Wolter on the MS infrastructure support for reliability in connected systems (in MS Architecture Journal, vol 8):

- SOA = connected systems
- services communicate through well defined message formats => reliability = reliability of the communication infrastructure
- message handling between services more complex than client/server because server must make client decisions
- message infrastructures in the MS world: MSMQ, SQL Server Service Broker, BizTalk (also offers data transformation)
- problems:
1. execution reliability (handling volume): different technologies deal with this differently, e.g. stored procedure 'activation' in Service Broker
2. lost message (communication reliability)
3. data reliability

Monday, November 27, 2006

Quartz Composer

For the last few days I have been experimenting with Apple's Quartz Composer. While this is primarily a motion-design tool (and a very powerful one indeed), it is also an example of a very effective graphical programming environment. Prior to this, I had seen such tools in the Windows environment and was less than impressed, but QC is really amazingly powerful; you can parse structures, use variables, loops, and everything somehow fits together very well. I see uses for this metaphor in the BPEL world, at the very least, but the whole world of distributed computing seems a good fit for it.

By the way, this is the 'code' behind one of the rather phenomenal demos that can be found here.

Thursday, November 02, 2006

Web OS part II

To further illustrate the merging of Web, database, and file share services, I'm currently involved in a SharePoint installation process where data from users' file shares will be moved to the SharePoint collaborative environment, with a SQL Server database as physical storage. Thus, the Web front end replaces the file storage functionality by delegating the actual storage to the SQL engine.

Saturday, October 21, 2006

MacOS run loops and console mode

Run loops do NOT run automatically in console mode, not even on the main thread. That kind of makes sense, since run loops are one of the mechanisms that support GUI events. So you have to create a run loop manually when running in console mode; it can use multiple timers, and it will take over the main thread, whose execution will only resume after the run loop's timers finish running.

Sunday, October 15, 2006

Adventures in multithreading

An insidious race condition arises in the following situation (which I encountered in Objective-C, but any language that passes by reference will allow for the same):

I have a consumer function which writes a message to a file or database and which can be called by multiple threads - so it is LOCKed. This function is called by threads generated by a loop, where the thread instantiation function takes as parameter a string which is modified by each loop iteration. E.g., in pseudo-C:

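string msg;
for( i = 0; i < 10; i++ ){
    sprintf( msg, "parameter: %i", i );   /* overwrites the one shared buffer */
    launchThread( msg );                  /* the thread gets a reference to msg */
}

/* body executed by each new thread */
void launchThread( string parm ){
    plock lock;
    fprintf( fHandle, "%s\n", parm );
    plock unlock;
}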

Of course since the fprintf needs to be atomic (in order not to generate a bus error), it is the one that has to be locked. However, msg is a shared resource as well. If you run this code as it is you will get an output similar to the following:

parameter20
parameter20
parameter30

Instead of the expected:

parameter1
parameter2
parameter3

That is because msg is modified by the loop and by the time thread #x has picked it up, who knows what value it has - certainly not one in sync with #x.

The solution is to provide as parameter to the thread a full immutable copy of msg and not a reference to msg.
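
A minimal Ruby sketch of this fix, with Ruby threads standing in for the Objective-C ones: the loop keeps reusing one mutable string, but each thread receives its own frozen copy as a thread argument, while a Mutex guards the shared sink. The names sink, lock, and own_copy are illustrative.

```ruby
require 'stringio'

sink = StringIO.new   # stands in for the file or database
lock = Mutex.new      # guards writes to the shared sink
msg  = String.new     # one mutable buffer, reused by every iteration

threads = (1..10).map do |i|
  msg.replace("parameter#{i}")
  # msg.dup.freeze is evaluated here, in the loop, before the next
  # iteration overwrites msg; the thread only ever sees its own copy.
  Thread.new(msg.dup.freeze) do |own_copy|
    lock.synchronize { sink.puts(own_copy) }
  end
end
threads.each(&:join)

lines = sink.string.split("\n")
puts lines.sort_by { |line| line[/\d+/].to_i }
```

The order in which the threads write is still not guaranteed, which is why the sketch sorts before printing; what the copy guarantees is that every message appears exactly once.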

Monday, September 18, 2006

Again, GoogleMaps

..it seems it's an issue of latency. From a really fast connection it was able to recognize both Tokyo and Hong Kong. London is still un-geocodable though I am afraid.

Objective-C

I have recently started looking into Objective-C. In my experience, one of the biggest hurdles when learning C++ is understanding who does what; textbooks focus on OO and spend a lot of time discussing building classes and coming up with silly examples, and little is said about how this translates into executable code. With C you still have a pretty good idea how the code becomes machine code. With C++ the connection is broken; a class does not map to registers and you are left with a major gap in the continuum. The same problem (even to a larger extent) occurs with SQL or VM-type languages such as C#, Java, or ActionScript.

The ObjC instructions from Apple are the only ones I have seen so far that do a good job of explaining the runtime, and how OO constructs become procedural code. I am very impressed.

I have a first attempt at writing Mac OS X code here, a Cocoa front end to Unix queues. ipcs does not seem to work with queues on Mac, but other than that the system calls seem to work quite well.

Friday, September 15, 2006

GoogleMaps v2

I finally rewrote the Maps application in a more objectified JavaScript. The source code is here and you can see it in action here. It seems that the Google Geocoder also does not recognize London, besides Hong Kong (actually, HK is sometimes recognized! what gives?) and Tokyo.

Flickr (FlickrMaps) uses it in a similar fashion.

Here is the UML Sequence Diagram of the interactions caused by this application:

Thursday, September 14, 2006

SQLite

...is very easy to use. To use from C (assuming gcc is the compiler):

- #include <sqlite3.h>
- compile like this: gcc file.c -lsqlite3 (the library goes after the source file, since some linkers resolve symbols left to right)
- you really only need 3 API functions: sqlite3_open, sqlite3_exec, sqlite3_close
- for sqlite3_exec, you need to provide a callback that takes a user-supplied pointer, the number of columns, the value of each column, and the name of each column; the library calls it once for each row of the result set
- you can create a database using 'sqlite3 database_name.db' from a shell prompt.

Monday, August 28, 2006

GoogleMaps

Since I will soon have my 'summer' vacation, here is a link to my first GoogleMaps app: a list of places I have been to. It's a V1 thing, hampered by my inadequate JavaScript skills, that I promise to improve.

I'm kind of surprised though that the Google Geocoder does not seem to recognize (cities in) Japan and China?? I even tried their online demo and it cannot find these two locations.

The API is here.

Of databases and connections

A few notes to self regarding SqlClient and OleDb connections:

link between program and data
opening a connection is expensive
hence connection pooling: ADO does not destroy the connection object even after you close it
the connection is kept in a pool
it is destroyed after a time out interval (60 seconds => disconnect in SQL Trace)
reusable connection: one that matches the connection details (data store, user name, password)
to turn connection pooling off in OLE DB: append 'OLE DB SERVICES = -2' to the ConnectionString:
- significant differences: 6 seconds for 100 000 connections to ....?
- not using this leaves the connection in SQL logged in at the time of the initial logon even after it is closed and reopened in code
- the connection disappears from Activity Monitor when the program exits
- however, if a connection is closed and the program is still running, after a while it disappears from the Activity Monitor (after the time out)
- if the connection is not closed, it stays open in the Activity Monitor
- multiple connections are opened if a Connection.Open is issued even if they have the same authentication and data store
setting the connection to null/Nothing clears it from the pool (? does not seem to affect the Activity Monitor)
- if the connection is set to nothing without closing it, it shows in Activity Monitor
- not closing the connection causes it not to time out even when set to nothing
- it is not clear what effect setting the connection to Nothing or Dispose-ing it has in OleDb
in ODBC: use the control panel (how do you turn it off???)
using the SqlClient instead of OleDb shows the application in Activity Monitor as .Net SqlClient Data Provider
using SqlClient seems to keep the connection alive even after closed for longer than OleDb (does it ever time out?)
using OleDb shows the application as the exe not the OleDb Data Provider
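
The pooling behavior described in these notes can be modeled in a few lines. This is an illustrative Ruby sketch, not how ADO or OLE DB is implemented: TinyPool, FakeConnection, and the connection details string are hypothetical, and the clock is passed in explicitly so that the idle timeout can be shown without sleeping.

```ruby
# Stand-in for a real connection; counts how many were physically opened.
class FakeConnection
  @opened = 0
  class << self; attr_accessor :opened; end

  attr_reader :details
  def initialize(details)
    @details = details
    self.class.opened += 1
  end
end

class TinyPool
  def initialize(timeout)
    @timeout = timeout                      # seconds an idle connection survives
    @idle = Hash.new { |h, k| h[k] = [] }   # details => [[conn, released_at], ...]
  end

  # "Open": reuse an idle connection with identical details, else create one.
  def open(details, now)
    expire(now)
    entry = @idle[details].pop
    entry ? entry.first : FakeConnection.new(details)
  end

  # "Close": the connection is not destroyed, only parked in the pool.
  def close(conn, now)
    @idle[conn.details] << [conn, now]
  end

  private

  # Drop connections that have been idle longer than the timeout.
  def expire(now)
    @idle.each_value { |list| list.reject! { |_, at| now - at > @timeout } }
  end
end

pool    = TinyPool.new(60)
details = "server=db1;user=app"   # hypothetical connection details
a = pool.open(details, 0)
pool.close(a, 1)
b = pool.open(details, 30)        # within the timeout: same object reused
pool.close(b, 31)
c = pool.open(details, 200)       # idle for 169 s: expired, new connection
puts a.equal?(b), b.equal?(c), FakeConnection.opened
```

In this model, closing is cheap and reopening with the same details costs nothing until the timeout passes, which matches the pattern seen in the trace: the disconnect only shows up after the idle interval.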

OK, that is a lot of notes to self. I'm investigating this stuff: it's fairly well known, but when you have to debug performance problems every little detail counts and you have to be considerably more careful reading the fine print.

Which reminds me, each OS should provide some kind of relational/transactional storage service. Unix/Linux/Mac OS already does - SQLite.

Wednesday, August 23, 2006

Web Operating System?

Various web cognoscenti have been ballyhooing the 'OS' concept in a web context: Google OS. This has more to do with the coolness factor of any new software development arena than with actual functionality provided by the respective software.

The Google suite offers the following:

GoogleDesktop (supposedly at the core of the 'OS', and a resource hog to boot!)
Audacity (audio editing)
Orkut (social networking)
GoogleTalk
GoogleVideo
GoogleCalendar
Writely (word processing)
Gdrive (internet data storage)

Other than Gdrive, none of the above belong to an OS.

Leaving coolness aside, there are genuinely innovative Web-based applications whose complexity is close to that of desktop-based applications. For example, computadora.de's shell is not that different from Windows 95's shell as far as the basic functionality it offers. Flash is a kind of Win32 in this case.

On the middle layer, salesforce.com is a good example of application domain functionality provided by a Web-based layer. It should be entirely possible to offer a payroll processing service.

And yes, I have a computadora.de account. You can even upload mp3's there and play them using the integrated mp3 player – which I did, an Alejandro Fernandes song, in keeping with the Mexican origin of the software.

SQL 2005 Endpoints

SQL Server can act as an application server by means of endpoints (listeners). These can be defined over TCP or over HTTP, and support SOAP, T-SQL, Service Broker, and database mirroring payloads.

To create a SOAP endpoint, create the stored procedures or functions that provide the functionality. Then run a CREATE ENDPOINT ... AS HTTP... FOR SOAP. Important parameters: SITE, and the WEBMETHODs collection.

A SOAP request returns an object array or a DataSet object. The default SOAP wrapper is created by Visual Studio.

This is quite nice. If you need to use a data-centric web service, just create one directly in SQL. To use it, just define the Web reference in the VS IDE; this will make an object of type SITE with an endpoint member through which you can access the data exposed by the SQL Server (e.g., if your SITE parameter was set to 'mySite' and the endpoint was named 'myEndPoint', you have a mySite object available which has a myEndPoint member, which exposes the functions/stored procedures defined on the SQL Server).

Monday, August 21, 2006

getUserPhotos part II

Zuardi in the previous post means Fabricio Zuardi, and he is the author of a Flickr API kit: a (REST-based) implementation of the Flickr API client in ActionScript. For some reason, he forgot (or overlooked) one of the methods in the API, getUserPhotos.