Saturday, August 8, 2009

Ruby and Postgres

I've been trying to learn Ruby recently (rails comes later), and as an exercise I decided to write an ETL connector to load transaction data into my financial database. getting it to work with Postgres was a few steps so I thought I'd post it here as it's a pretty self-contained example.

First do these steps
  • install postgres-server-dev
  • sudo gem install postgres
  • sudo gem install pg
  • gem contents pg will allow you to view examples



  • Here's my code that converts the csv file and loads it up. I could not figure out how to turn of autocommit (we all know how I feel about autocommit) so I just start a transaction asap. I also couldn't figure out how to see how many rows are inserted with each insert statement like in perl, so I just used the PK. Enjoy!



    require 'rubygems'
    require 'pg'
    raise 'usage: ruby usbank.rb csvfile' if not ARGV[0]

    @conn = PGconn.new({:host=>'localhost',:user=>'DBUSER',:dbname=>'DBNAME'})
    @conn.setnonblocking(true)
    @conn.exec('BEGIN')

    def getid
    @conn.exec('select max(txn_id::numeric)+1 as txn_id from txn')[0]['txn_id']
    end

    file = File.new(ARGV[0], "r")
    file.gets

    begin_id = getid()

    while (line = file.gets)
    @array = line.split(",")
    date = @array[0]
    item = "#{@array[1]} #{@array[2]} #{@array[3]}"
    amount = @array[4]
    amount.chomp!
    puts "#{amount} #{date} #{item}"

    query = "insert into txn (date, amount, cat_id, item, source_id, txn_id)
    select
    \'#{date}\',
    #{amount},
    0,
    \'#{item}\',
    5,
    ( select nextval('txn_id_seq'))
    where not exists (
    select 1 from txn where date= \'#{date}\' and amount=#{amount}
    );\n
    "
    @conn.exec(query)
    end

    end_id = getid()
    puts "#{Integer(end_id) - Integer(begin_id)} updated."

    @conn.exec('COMMIT')
    @conn.finish
    file.close

    No comments:

    Post a Comment