- Django==1.2
- Fabric==1.0.1
- SQLAlchemy==0.6.6
- psycopg2==2.4
./importdata.py <datadirname>
Imports the data files into the currently selected database
fabric select_sqlite
Select sqlite database
fabric select_psqltest
Select postgresql database of test data
fabric select_psql2011
Select postgresql database of final data
fabric select_mysql2011
Select MySQL database of final data
The API can be accessed via:
from kddcup2011 import *
- tAlbum
- tArtist
- tGenre
- tRating
- tTrack
- tUser
- A simple selection might look like::

    select([tArtist.c.artist_id]).execute().fetchall()

More information can be found at the SQLAlchemy website.
- Return number of users in DB::

    >> tUser.count().execute().fetchall()
    [(1000990L,)]

- Return number of albums in each genre (not including null)::

    >> s = select([tAlbumGenre.c.genre_id, func.count(1)]).group_by(tAlbumGenre.c.genre_id)
    >> res = conn.execute(s)
    >> res.fetchall()
    [(14741L, 571L), (17863L, 6077L), (19484L, 490L), (20638L, 496L), ...

Other aggregate functions, such as func.avg, are also available.
When an aggregate is selected alongside a regular column, remember to
add a matching group_by() call on that column.
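
Pairing an aggregate with group_by mirrors SQL's GROUP BY clause: the aggregate is computed once per distinct value of the grouped column. The sketch below illustrates this with Python's standard sqlite3 module and a toy in-memory table. It does not use the kddcup2011 API, and the table name ``rating`` and the columns ``user_id`` and ``score`` are assumptions made for illustration, not the real schema. The SQLAlchemy form would presumably select func.avg on the score column and call group_by on the user column.

```python
import sqlite3


def average_score_per_user(rows):
    """Return [(user_id, average score), ...] using AVG with GROUP BY.

    `rows` is an iterable of (user_id, score) pairs. The table and
    column names are hypothetical and exist only for this example.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE rating (user_id INTEGER, score REAL)")
        conn.executemany("INSERT INTO rating VALUES (?, ?)", rows)
        # AVG is evaluated once per group, i.e. once per distinct user_id.
        cur = conn.execute(
            "SELECT user_id, AVG(score) FROM rating "
            "GROUP BY user_id ORDER BY user_id"
        )
        return cur.fetchall()
    finally:
        conn.close()


toy_rows = [(1, 80.0), (1, 100.0), (2, 50.0)]
print(average_score_per_user(toy_rows))  # [(1, 90.0), (2, 50.0)]
```

Without the GROUP BY clause, the aggregate would be computed over the whole table instead of per user, which is why the two are used together.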