Friday, January 20, 2012

Database.com

I have started using database.com. Although the authentication (from the Apex Data Loader) is a pain, I have to say I am quite impressed with how powerful the platform is. A couple of quick thoughts, with more to come soon I hope, as I have just started developing an application using it:

  • interesting to see how many of the features (value and display lists, row id references) are similar to (for example) Caché's: basically nothing is ever new in computing, it seems
  • its usability would be greatly increased, I suspect, if it supported disconnected recordsets, so that a client application could use it even without access to the cloud
This also proves that it does make sense in certain instances to maintain one's own platform/language/database stack.

More soon.

Friday, June 25, 2010

Google Analytics API

Very basic GA API project (very poorly hosted too!)

Thursday, April 29, 2010

Twitter Python Mongo

...or how many buzzwords can you fit in one title. Here is a shortish piece of code that pulls data from Twitter and inserts it into Mongo. Other than the shortness of the code (given what it accomplishes!), what is remarkable is how easily the data passes around, with a minimum of marshalling: Twitter returns JSON, which is Mongo's native format and which Python can use with a minimum of tweaking (mostly to trim down the response from Twitter).


import urllib
import json
from pymongo import Connection

def runQuery(query, pp, pages):
    ret = []
    for pg in range(1, pages + 1):
        print 'page...' + str(pg)
        p = urllib.urlopen('http://search.twitter.com/search.json?q=' + urllib.quote(query) + '&rpp=' + str(pp) + '&page=' + str(pg))
        # json.load parses the response directly; JSON null becomes Python None
        s = json.load(p)
        p.close()
        for result in s['results']:
            ret.append({'id': result['id'], 'from_user': result['from_user'], 'created_at': result['created_at'], 'text': result['text']})
    return {'results': ret}

c = Connection()
d = c.twitterdb
coll = d.postbucket
res = runQuery('Iran', 100, 15)
for item in res['results']:
    coll.save(item)
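
A quick sanity check from the Python prompt that the tweets actually landed (a sketch):

>>> from pymongo import Connection
>>> coll = Connection().twitterdb.postbucket
>>> print coll.count()
>>> for t in coll.find().limit(3):
...     print t['from_user'], ':', t['text']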

A Twitter Python web service

Taking the code from the previous post: here is a Python web service that reads the Twitter feed for a given query and returns a subset of the results in JSON:


import urllib
import json
from SimpleXMLRPCServer import SimpleXMLRPCServer
from SimpleXMLRPCServer import SimpleXMLRPCRequestHandler

def runQuery(query, pp, pages):
    p = urllib.urlopen('http://search.twitter.com/search.json?q=' + urllib.quote(query) + '&rpp=' + str(pp) + '&page=' + str(pages))
    # json.load parses the response directly; JSON null becomes Python None
    s = json.load(p)
    p.close()
    ret = []
    for result in s['results']:
        ret.append({'id': result['id'], 'from_user': result['from_user'], 'created_at': result['created_at'], 'text': result['text']})
    return json.dumps({'results': ret})

class RequestHandler(SimpleXMLRPCRequestHandler):
    rpc_paths = ('/RPC2',)  # must be a tuple; note the trailing comma

server = SimpleXMLRPCServer(("localhost", 8000), requestHandler=RequestHandler)
server.register_introspection_functions()
server.register_function(runQuery, 'qry')
server.serve_forever()



More potential uses of this (including Google Apps, Mongo, or Processing) later. And here is how to use it (from Python):


>>> import xmlrpclib
>>> s = xmlrpclib.ServerProxy('http://localhost:8000')
>>> print s.qry('Bumrungrad', 10, 1)

Where the first numeric parameter is the number of records per page and the second the page number (the search API caps these at 100 records per page and 15 pages).

Wednesday, April 28, 2010

Twitter API

A bit of topical coding... getting tweets about the situation in Bangkok:


>>> import urllib
>>> from xml.dom import minidom
>>> p = urllib.urlopen('http://search.twitter.com/search.atom?q=Bangkok')
>>> xml = minidom.parse(p)
>>> p.close()
>>> nodes = xml.getElementsByTagName('title')
>>> for node in nodes:
...     print node.firstChild.nodeValue


This is the first time I have tried the Twitter API, and it seems simple enough!

Monday, April 26, 2010

Very basic Google Chart

  • create the URL (the chs, chd, and cht parameters below set the size, the data, and the chart type)
  • you can then pull it in Python:

>>> import urllib
>>> p = urllib.urlopen('http://chart.apis.google.com/chart?chs=250x100&chd=t:60,40,90,20&cht=p3')
>>> data = p.read()
>>> f = open('d:\\file.png', 'wb')
>>> f.write(data)
>>> f.close()



It's quite easy to build the URL from the data in a Google Docs spreadsheet; the code below (adapted from Google's own documentation) reads the spreadsheet, and a URL-building sketch follows the listing:



try:
    from xml.etree import ElementTree
except ImportError:
    from elementtree import ElementTree
import gdata.spreadsheet.service
import gdata.service
import atom.service
import gdata.spreadsheet
import atom

def main():
    gd_client = gdata.spreadsheet.service.SpreadsheetsService()
    gd_client.email = '______________@gmail.com'
    gd_client.password = '________'
    gd_client.source = 'SpreadSheet data source'
    gd_client.ProgrammaticLogin()

    print 'List of spreadsheets'
    feed = gd_client.GetSpreadsheetsFeed()
    PrintFeed(feed)

    key = feed.entry[0].id.text.rsplit('/', 1)[1]

    print 'Worksheets for spreadsheet 0'
    feed = gd_client.GetWorksheetsFeed(key)
    PrintFeed(feed)

    key_w = feed.entry[0].id.text.rsplit('/', 1)[1]

    print 'Contents of worksheet'
    feed = gd_client.GetListFeed(key, key_w)
    PrintFeed(feed)

def PrintFeed(feed):
    for i, entry in enumerate(feed.entry):
        if isinstance(feed, gdata.spreadsheet.SpreadsheetsCellsFeed):
            print 'Cells Feed: %s %s\n' % (entry.title.text, entry.content.text)
        elif isinstance(feed, gdata.spreadsheet.SpreadsheetsListFeed):
            print 'List Feed: %s %s %s' % (i, entry.title.text, entry.content.text)
            print '  Contents:'
            for key in entry.custom:
                print '    %s: %s' % (key, entry.custom[key].text)
            print '\n',
        else:
            print 'Other Feed: %s. %s\n' % (i, entry.title.text)

if __name__ == "__main__":
    main()
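
And, as promised, a minimal sketch of the URL-building step itself (a hypothetical helper: it uses the chart API's simple text encoding, chd=t:..., as in the example at the top of this post; the values would come from the entry.custom fields printed by the list feed above):

def chartUrl(values, size='250x100', charttype='p3'):
    # join the numbers into the simple text encoding and assemble the query string
    data = ','.join(str(v) for v in values)
    return ('http://chart.apis.google.com/chart?chs=' + size +
            '&chd=t:' + data + '&cht=' + charttype)

>>> print chartUrl([60, 40, 90, 20])
http://chart.apis.google.com/chart?chs=250x100&chd=t:60,40,90,20&cht=p3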

Friday, April 23, 2010

NHS Choices on GoogleApps

Here is the Google Apps version of the (Python) NHS Choices application I discussed in the previous posts.

I can't even begin to say how cool this is. Three hours in Notepad (hence the crudeness) and we get the hospitals in the UK, from anywhere. This is really amazing.

The source code.
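
The linked source is the real thing; as a rough idea, the core of the app might look like this (a sketch only: a webapp handler plus urlfetch, hitting the same NHS endpoint used in the posts below; the handler name is mine):

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.api import urlfetch
from xml.dom import minidom

class Hospitals(webapp.RequestHandler):
    def get(self):
        # query the NHS 'Find Services' feed for hospitals (type=5) in Wigan
        url = ('http://www.nhs.uk/NHSCWS/Services/ServicesSearch.aspx'
               '?user=__login__&pwd=__password__&q=Wigan&type=5')
        xmldoc = minidom.parseString(urlfetch.fetch(url).content)
        self.response.headers['Content-Type'] = 'text/plain'
        for node in xmldoc.getElementsByTagName('Service'):
            name = node.getElementsByTagName('Name')[0]
            self.response.out.write(name.firstChild.nodeValue + '\n')

application = webapp.WSGIApplication([('/', Hospitals)], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()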

Saturday, April 10, 2010

Mongo, Python, and NHS Choices


Using Python, NHS open data (NHS Choices), and Mongo: for example, getting the names and web sites of all the providers in the Wigan area (why Wigan? No idea, just that their football team, which seems to be pretty bad, recently defeated Arsenal, and the name stuck with me).

Start the database: go to the bin subdirectory of the install directory, and type mongod --dbpath .\

I will connect to the database using the Python API (pymongo).

NHS Choices offers several health data feeds:

  • News
  • Find Services
  • Live Well
  • Health A-Z (Conditions)
  • Common Health Questions


As mentioned, I will use the second; to access it, you need to get a password and a login (apply for one here).

The basic Python code to query for providers and extract their names and web addresses is this:

First, build a list of services, as per the NHS documentation (the service code and the location are two required parameters):

services = [[1, 'GPs'], [2,'Dentists'], etc]

Then, query the web service:

for x in range(0, len(services)):
    # the service codes are not contiguous, so pass the code itself, not the loop index
    endpoint = 'http://www.nhs.uk/NHSCWS/Services/ServicesSearch.aspx?user=__login__&pwd=__password__&q=Wigan&type=' + str(services[x][0])
    usock = urllib.urlopen(endpoint)
    xmldoc = minidom.parse(usock)
    usock.close()
    nodes = xmldoc.getElementsByTagName("Service")
    for node in nodes:
        website = node.getElementsByTagName("Website")
        name = node.getElementsByTagName("Name")
        if website[0].firstChild is not None:
            pass  # extract the name and web site here (see the full listing below)
    xmldoc.unlink()

The response yields a three-item record: the service type, the provider name, and the web site (if one exists).
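
For reference, the relevant part of the XML looks roughly like this (the element names are the ones parsed above; the wrapper layout is an assumption, and the values are copied from the sample result further down):

<Services>
  <Service>
    <Name>Thomas Linacre Outpatient Centre</Name>
    <Website>http://www.wiganleigh.nhs.uk/Internet/Home/Hospitals/tlc.asp</Website>
  </Service>
</Services>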

Mongo is a bit different in that the server does not physically create a database until something is written to it, so from the console client (launched, in \bin\, as mongo) you can connect to a database that does not exist yet (use NHS in this case will switch to the NHS database, which will be created on first write - in effect, files named NHS.* in the data directory).

Creating the 'table' from the console client: NHS = { service : "service", name : "name", website : "website" };

db.data.save(NHS); will create a collection (roughly the equivalent of a SQL table) and save the NHS document into it. The mongo client uses JavaScript as its language and JSON notation to define documents.

To access this collection in Python:

>>> from pymongo import Connection
>>> connection = Connection()
>>> db = connection.NHS
>>> storage = db.data
>>> post = {"service" : 1, "name" : "python", "website" : "mongo"}
>>> storage.insert(post)


Here is the full code in Python to populate the database:

import urllib
from xml.dom import minidom
from pymongo import Connection

print "Building list of services..."
services = [[1, 'GPs'], [2, 'Dentists'], [3, 'Pharmacists'], [4, 'Opticians'],
            [5, 'Hospitals'], [7, 'Walk-in centres'], [9, 'Stop-smoking services'],
            [10, 'NHS trusts'], [11, 'Sexual health services'],
            [12, 'DISABLED (Maternity units)'], [13, 'Sport and fitness services'],
            [15, 'Parenting & Childcare services'], [17, 'Alcohol services'],
            [19, 'Services for carers'], [20, 'Renal Services'],
            [21, 'Minor injuries units'], [22, 'Mental health services'],
            [23, 'Breast cancer screening'], [24, 'Support for independent living'],
            [26, 'Memory problems'], [27, 'Termination of pregnancy (abortion) clinics'],
            [28, 'Foot services'], [29, 'Diabetes clinics'], [30, 'Asthma clinics'],
            [31, 'Midwifery teams'], [32, 'Community clinics']]

print "Connecting to the database..."
connection = Connection()
db = connection.NHS
storage = db.data

print "Scanning the web service..."
for code, label in services:
    print '*** ' + label + ' ***'
    # the service codes are not contiguous, so pass the code itself,
    # not the loop index, to the web service
    endpoint = 'http://www.nhs.uk/NHSCWS/Services/ServicesSearch.aspx?user=__login__&pwd=__password__&q=Wigan&type=' + str(code)
    usock = urllib.urlopen(endpoint)
    xmldoc = minidom.parse(usock)
    usock.close()
    for node in xmldoc.getElementsByTagName("Service"):
        website = node.getElementsByTagName("Website")
        name = node.getElementsByTagName("Name")
        namei = name[0].firstChild.nodeValue
        if website[0].firstChild is not None:
            websitei = ' ' + website[0].firstChild.nodeValue
        else:
            websitei = 'none'
        post = {"service": code, "name": namei, "website": websitei}
        storage.insert(post)
    xmldoc.unlink()


To see the results from the Mongo client:


> db.data.find({service:5}).forEach(function(x){print(tojson(x));});


This will return all the hospitals inserted into the database (the service code for hospitals is 5); the response looks like this:


{
    "_id" : ObjectId("4bc062dbc7ccc10428000032"),
    "website" : " http://www.wiganleigh.nhs.uk/Internet/Home/Hospitals/tlc.asp",
    "name" : "Thomas Linacre Outpatient Centre",
    "service" : 5
}


Next, it might be interesting to try this using Mongo's REST API, and perhaps to build a GoogleApp to do so.
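
As a teaser, a minimal sketch of the REST idea (assuming mongod is restarted with the --rest option, which enables its read-only HTTP interface on port 28017; the simple REST API wraps results in a 'rows' field):

import urllib
import json

# all providers with service code 5 (hospitals), straight over HTTP
p = urllib.urlopen('http://localhost:28017/NHS/data/?filter_service=5&limit=10')
result = json.load(p)
p.close()
for row in result['rows']:
    print row['name']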

Wednesday, March 31, 2010

Everything is searchable

Interesting things have been happening while I wasn't watching. Not only the previously mentioned Data.gov, or the newspapers moving in the same direction, but also ...

I'm really curious what effect this will have on databases. In theory, with the right authentication in place, everything could be exposed online and EDI would be vastly simplified: everything from sneakernet to HL7 would be replaced by REST calls.

At any rate, Freebase's attempt to organize everything is ambitious/stunning.

Wikipedia's own API, here.

Related: Talis.

Data.gov

The US Government's open data initiative. Interesting; I have to find out more about what's there. The potential for mashups and visualizations is great... if the data is trusted and current.
Here is the equivalent UK site.
I need to look into this some more; for now, a lot of the data seems to be Excel files that can be downloaded. No universal REST/JSON access?

Thursday, January 21, 2010

Google App Engine and Python

Quick steps to develop with Google App Engine:

- download the SDK
- this will place a Google App Engine Launcher shortcut on your desktop
- click on it
- File > Create New Application and choose the directory (a subdirectory with the app name will be created there)
- edit app.yaml and set the name of the main *.py file you'll be using (say, myapp.py)
- create your myapp.py file and save it in the subdirectory created two steps ago (a minimal sketch follows this list)
- select the app in the Google App Engine Launcher main window and click Run
- open a browser: http://localhost:808x (this is the default value; check the port in your application settings as defined in the Launcher)
- finally, deploy it to your applications in your Google profile
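
And a minimal sketch of myapp.py to get started (this uses the webapp framework bundled with the Python SDK; the handler name is mine):

from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class MainPage(webapp.RequestHandler):
    def get(self):
        # plain-text hello world
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write('Hello from App Engine!')

application = webapp.WSGIApplication([('/', MainPage)], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == '__main__':
    main()

In app.yaml, the handlers section would then point at this file (script: myapp.py).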

More later; this is just a quick vade mecum.

Saturday, December 05, 2009

Google App Engine

The Python dev environment. A very basic sample app that comes with the SDK - and which I managed to deploy without a lot of headache (and without having read the documentation!).

Saturday, October 31, 2009

Cloud, AIR, GoogleHealth

My new article is online at developer.com. I wish I had added a few more images, though. What to write about next?

Monday, October 26, 2009

Wither SQL?

More on the rise of non-relational databases. Maybe it is time for me to do another 'strategy' post... something to tie Caché, CouchDB, and all the others together.

Tuesday, September 08, 2009

Follow-ups

An interesting link related to my previous posts on InterSystems, HL7, etc.

And SPARQL, something I should look into.

Thursday, July 23, 2009

Open Source, Cloud-based Approach to Describing Solution Architectures

Mike Walker discusses in a recent issue of the Microsoft Architecture Journal a set of tools that can be used to document solution architectures - based, not surprisingly, on Microsoft tools. Together, these make up the Enterprise Architecture Toolkit.

Since I don't have a Windows Server to run SharePoint (I could, presumably, use Azure), I came up with a similar application setup using open source or cloud-based tools:

[chart: the open source / cloud-based setup, with a "gateway" manager tying the tools together]

The only thing that needs to be built is the manager (the "gateway" in the chart above), which could be a RIA application whose role is to tie everything together. Sounds simple enough?

Wednesday, July 15, 2009

Google Maps knows where you are

This is way cool: if you connect to the Internet using WiFi, Google Maps 'knows' where you are and shows your location by default.

Slowly it is all coming together - the 'cloud' means that you can keep your data (and processes!) in one place and access it (via WiFi) from anywhere, even using a lightweight client. Both the client and the cloud backend 'know' where you are, so functionality can be tailored to the time and location.

I'm not sure how much computing power is needed on the (portable) client - probably only enough for rich media rendering. Other than specialized applications, most of what an average user really needs should be easily done using a client that combines media, communication, and lightweight computing services. I don't think the iPhone is there yet (as the all-purpose 'client'), but perhaps a combination of iPhone and Kindle, three versions from now, might become just that.

Sunday, July 12, 2009

XProc

Documentum's XProc XDesigner - a first step towards what I see as a full online development environment, although this is more similar to Yahoo Pipes. The technology is there (a web-based GUI plus the cloud for compilation, and possibly even for deployment); I think it's only a matter of tool developers finding a way to monetize it. Is Microsoft really making money on Visual Studio, though?