您的位置:首页 > 数据库

我的rails项目从sqlserver到postgres迁移之路

2013-04-18 10:28 113 查看
    近期想把我们的某些rails项目从sqlserver迁移到postgres,由于我们的rails项目里夹杂很多的find_by_sql,而sqlserver与postgreSQL里很多语法/方法不同,需要sql的翻译处理。没有找到特别合适的插件做这件事情,只好自己动手。

    一. 环境:

      rails版本:3.0.7;  ruby版本:1.8.7;

    二. 主要步骤和方法:

    1.导数据 

        利用Navicat for PostgreSQL将数据从sqlserver拷贝到postgreSQL

    2.导完数据发现表无id,添加id

        添加id有各种方法,你都可以尝试。当然你也可以用我在下面写的ruby代码。

    3.添加id后发现id不自增长,为id添加sequence。

        具体sql操作方法参可以参考http://tian-wang.iteye.com/blog/1051227

        由于我们已有数据,所以设置sequence时要指定start with n,其中n = max(id) + 1 。

        特别要注意的是postgreSQL表名、字段名、序列名等最好不要用大写;序列名大写时会报错: relation "sequence_name" does not exist。

        还有一点需要ruby程序员注意的是postgreSQL中的单双引号与ruby中单双引号意义完全不同,一不小心容易出错。

        由于项目表数目太多,我写了一段ruby代码来完成这件事情(第二步和第三步),没有优化,先将就用着,至于怎么调用,不用多说了吧,哈哈:

module CommonExtensions
module OneTimeUsedForMigrate
extend ActiveSupport::Concern

#注意:postgreSQL的单双引号   和   ruby的单双引号不同,要特别注意
#postgreSQL表名、字段名、序列名等最好不要用大写
#序列名大写时会报错:ActiveRecord::StatementInvalid: PG::Error: ERROR:  relation "tables_id_seq" does not exist
module ClassMethods
def fixed_for_pk
tables = fixed_all_table_names
Common.fixed_deal_with_primary_keys(tables)
end

def fixed_deal_with_primary_keys(tables)
tables.each do |table|
if fixed_has_primary_key?(table)
pk = fixed_get_pk_name(table)
else
pk = fixed_set_primary_key(table)
end
if pk
sequence_name = fixed_add_sequence(table, pk, true, true)
fixed_associate_pk_and_sequence(table, pk, sequence_name)
end
end
end

def fixed_add_sequence(table,pk,reset_seq=false,force=false)
sequence_name = fixed_get_pk_sequence_name(table,pk)
if reset_seq  #强制删除主键上的sequence?
begin
delete_sql = "DROP SEQUENCE \"#{sequence_name}\" CASCADE "
Article.find_by_sql(delete_sql)
rescue
end
end
if force #删除主键为空的列
delete_null_id_sql = "delete from \"public\".\"#{table}\" where \"#{pk}\" is null"
Article.find_by_sql(delete_null_id_sql)
end
#设置主键从多少开始自增长
base_num =  fixed_get_sequence_based_num(table,pk)
add_sql = "CREATE SEQUENCE \"#{sequence_name}\"
START WITH :base_num
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1 "
Article.find_by_sql([add_sql, {:base_num => base_num}])
sequence_name
end

#序列从?开始增长?
def fixed_get_sequence_based_num(table,pk)
base_num_record = Article.find_by_sql("select max(\"#{pk}\") as max_num from \"#{table}\"").first
base_num =  base_num_record ? (base_num_record.max_num.to_i + 1) : 1
end

def fixed_associate_pk_and_sequence(table,pk,sequence_name)
sql = "alter table \"#{table}\" alter column \"#{pk}\" set default nextval('#{sequence_name}') "
Article.find_by_sql(sql)
true
end

#生成序列名
def fixed_get_pk_sequence_name(table,pk = false)
adaptive_pk = pk || fixed_get_pk_name(table)
"#{table}_#{adaptive_pk}_seq".downcase
end

#获取主键所在的列的名字(并不是主键名)
def fixed_get_pk_name(table)
return all_primary_keys[:col_name] unless all_primary_keys
.blank? fixed_get_assumed_pk(table) #{:col_name => pk_record.colname, :type => pk_record.typename, :pk_name => pk_record.pk_name} end #如果已有主键,会出错 def fixed_set_primary_key(table) assumed_pk = fixed_get_assumed_pk(table) return false if assumed_pk.nil? sql = "alter table \"#{table}\" add primary key (\"#{assumed_pk}\") " Article.find_by_sql(sql) assumed_pk end #猜测可能会成为主键的名字 def fixed_get_assumed_pk(table) sql = "select table_schema,table_name,column_name,data_type,column_default,is_nullable from information_schema.columns where table_name = '#{table}' " columns = Article.find_by_sql(sql).map{|col| [col.column_name, col.data_type]} possible_pk_cols = ["id","#{table}ID","#{table}_id"] possible_model_name = table.singularize possible_pk_cols << "#{possible_model_name}_id" << "#{possible_model_name}ID" possible_pk_cols.uniq! possible_pk_cols.each do |possible_pk_col| return possible_pk_col if fixed_test_column_is_pk?(possible_pk_col,columns) end nil end def fixed_test_column_is_pk?(test_col,columns) columns.select{|col| col[0] == test_col && col[1].downcase == "integer"}.size > 0 end def fixed_has_primary_key?(table) !Common.all_primary_keys
.blank? end def fixed_all_table_names system_tables = ["sysdiagrams", #sqlserver系统表 "othertables"] special_tables = [] #特殊表,或者不适用/不想加入本次处理的表 sql = "SELECT pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\", relname as \"TableName\" FROM pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace LEFT JOIN pg_catalog.pg_database d ON d.datname = 'postgres' WHERE relkind IN ('r') AND n.nspname NOT IN ('pg_catalog', 'information_schema') AND n.nspname !~ '^pg_toast' ORDER BY relname" Article.find_by_sql(sql).map(&:TableName) - system_tables - special_tables end #{table_name => {col_name,type,pk_name}} def all_primary_keys sql = "select pg_constraint.conname as pk_name,pg_class.relname as table_name,pg_attribute.attname as colname,pg_type.typname as typename from pg_constraint inner join pg_class on pg_constraint.conrelid = pg_class.oid inner join pg_attribute on pg_attribute.attrelid = pg_class.oid and pg_attribute.attnum = pg_constraint.conkey[1] inner join pg_type on pg_type.oid = pg_attribute.atttypid where pg_constraint.contype='p' --and pg_class.relname = 'yourtablename' " res = {} pk_records = Article.find_by_sql(sql) pk_records.each do |pk_record| res[pk_record.table_name] = {:col_name => pk_record.colname, :type => pk_record.typename, :pk_name => pk_record.pk_name} end res end end end end

    4.开始大头活,做翻译工作

        a. 调整top n

            top n  =>  limit n

#todo:目前对"top ?"支持不好,sql最好不要写成这种形式,尽量用 top :top 这种形式
def adjust_top_part(sql)
focus_reg = /\s+top\s+(\d+|:\w+|\?)/
focus_match_sql = sql.match focus_reg
return sql.gsub(focus_reg,"") + " limit #{focus_match_sql[1]} " if focus_match_sql
sql
end

        b. 调整getdate

           getdate  => CURRENT_DATE

def adjust_get_date(sql)
sql.gsub(/getdate\(\s*\)/i," CURRENT_DATE ")
end

        c. 调整is_null

           is_null => COALESCE

def adjust_is_null(sql)
sql.gsub(/isnull/i,"COALESCE")
end

        d. 调整datediff

           datediff =>  我采取的方法
4000
是调整rails中的sql的同时,为postgreSQL添加datediff函数

def adjust_date_diff(sql)
focus_reg = /datediff\s*\(.*?(\,)/
sql.gsub(focus_reg){|match|
"datediff(" + self.adjust_date_str_for_date_diff(match.gsub(/datediff\s*\(/,"").gsub(/\s*\,$/,"")) + ","
}
end

-- Function: datediff(character varying, timestamp without time zone, timestamp without time zone)

-- DROP FUNCTION datediff(character varying, timestamp without time zone, timestamp without time zone);

CREATE OR REPLACE FUNCTION datediff(arg_mode character varying, arg_d2 timestamp without time zone, arg_d1 timestamp without time zone)
RETURNS integer AS
$BODY$BEGIN

if arg_mode = 'dd' or arg_mode = 'd' or arg_mode = 'day' or arg_mode = 'dy' or arg_mode = 'w' then

return cast(arg_d1 as date) - cast(arg_d2 as date);

elsif arg_mode = 'ww' or arg_mode = 'week' then

return ceil( ( cast(arg_d1 as date) - cast(arg_d2 as date) ) / 7.0);

elsif arg_mode = 'mm' or arg_mode = 'm' or arg_mode = 'month' then

return 12 * (date_part('year',arg_d1) - date_part('year',arg_d2)) + date_part('month',arg_d1) - date_part('month',arg_d2) +

case when date_part('day',arg_d1) > date_part('day',arg_d2)

then 0

when date_part('day',arg_d1) = date_part('day',arg_d2) and cast(arg_d1 as time) >= cast(arg_d2 as time)

then 0

else -1

end;

elsif arg_mode = 'yy' OR arg_mode = 'y' OR arg_mode = 'yyyy' or arg_mode = 'year' then

return (cast(arg_d1 as date) - cast(arg_d2 as date)) / 365;

end if;

END $BODY$ LANGUAGE plpgsql;


        f.  调整column like ‘%condition%’

           column like ‘%condition%’ => column::text like '%condition%' (这里用sql标准函数cast(column as type)可能会更好)

#todo: to modify this method to accept more db_column_name
def adjust_like_str(sql)
db_column_name = "columncode"
focus_reg = /#{db_column_name}\s+like/
sql.gsub(focus_reg,"#{db_column_name}::text like" )
end


        g. 重写存储过程

            rails项目中尽量不要用存储过程,不便于维护,如果是处于性能考虑可以在采取各种优化手段,这个没啥好说的,动手干吧。

        h. 重写标量值函数

           和上一步一样,体力活。

        i.  其他sql函数

           根据项目中用到的sql情况添加,比如left():

-- Function: "left"(integer, integer)

-- DROP FUNCTION "left"(integer, integer);

CREATE OR REPLACE FUNCTION "left"(integer, integer)
RETURNS integer AS
$BODY$
DECLARE
num ALIAS FOR $1;
str text := '';
n ALIAS FOR $2;
BEGIN
str := num::text;
if n >= length(str) then
return str::int;
end if;
return substring(str from 1 for n)::int;

END;
$BODY$
LANGUAGE plpgsql;


    5.调整find_by_sql、count_by_sql等

        将翻译sql的工作隐藏到find_by_sql、count_by_sql等方法中

module ActiveRecord
class Base
class << self
def find_by_sql(sql)
return nil if sql.blank?
adjusted_sql = adjust_sql_of_find_by_sql(sql) if need_adjusted?
connection.select_all(sanitize_sql(adjusted_sql || sql), "#{name} Load").collect! { |record| instantiate(record) }
end

def count_by_sql(sql)
return nil if sql.blank?
adjusted_sql = adjust_sql_of_find_by_sql(sql) if need_adjusted?
connection.select_value(sanitize_conditions(adjusted_sql || sql), "#{name} Count").to_i
end

private
def need_adjusted?
connection.class.to_s.downcase.include? 'postgres'
end
def adjust_sql_of_find_by_sql(sql)
case sql
when Array;
sql_dup = sql.dup
sql_dup[0] = Common.adjust_sql_to_postgre(sql[0])
sql_dup
when Hash; sql
else
Common.adjust_sql_to_postgre(sql)
end
end
end

end
end

    6.其他地方的小调整

    目前已经通过本地测试,等待预上线测试,更期待测试报告,哈哈。

pp.prog.pro@hotmail.com (hanpp@fwxgx.com已停用)
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: