您的位置:首页 > 其它

数据仓库项目中的IP转换成地区

2008-11-23 13:52 323 查看
在电子商务网站中,假如我们需要分析用户的地区的分布的时候,虽然一般的网站都有记录用户所在省份和地区的字段,但是一般的网站都不会去核实用户的这些信息,因此数

据的准确性就有比较大的偏差,甚至很多用户都不填写这个信息(如果不是必填项的话),那么这个时候用户注册时候的IP地址就显得可靠多了,但是要把IP地址转换成地区并

非易事,倒不是说这个技术很难实现,是因为IP分配规则太乱了,不象邮政编码或者区号那么定的那么死,所以这个就成为了一项比较浩大的工程。IP地址库有商业的可以购买

,也有免费的可以使用。我就是使用的广大网友经常使用的QQWry.Data 这个东东,我先叫人家把这里面的数据整理成我定好的格式(这是一项比较大的工程,3个人弄大概弄了2

天),然后导入到数据仓库中我的地区维度表。

数据仓库中的地区维度的表结构如下:

GeographyKey int

Province varchar(50)

City varchar(50)

IPSections text

我的源系统中的用户的表结构如下:

CustomerID varchar(50)

Account nvarchar(50)

RegisterTime datetime

IPAddress varchar(50)

UpdateTime datetime

数据仓库中的用户的表结构如下:

CustomerKey int

GeographyKey int

RegisterDateKey int

CustomerCodeAlternateKey varchar(50)

Account nvarchar(50)

IPAddress varchar(50)

RegisterTime datetime

UpdateTime datetime

接下来还是使用我们的好帮手SSIS。

其中关键在于中间的那个脚本组件,里面包含一个类IPCompare

' Microsoft SQL Server Integration Services user script component

2' This is your new script component in Microsoft Visual Basic .NET

3' ScriptMain is the entrypoint class for script components

4

5Imports System

6Imports System.Xml

7Imports System.Net

8Imports System.Data

9Imports System.Data.SqlClient

10Imports System.Math

11Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

12Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

13

14Public Class ScriptMain

15 Inherits UserComponent

16

17 Private dataSet As DataSet = New DataSet()

18

19 Public Overrides Sub 输入0_ProcessInputRow(ByVal Row As 输入0Buffer)

20

21 Dim GeographyKey As Int32 = 0

22 Dim ipcurrent As String = String.Empty

23

24 Try

25 If (Row.IPAddress_IsNull()) Then

26 Row.GeographyKey = -1

27 Return

28 End If

29

30 ipcurrent = Row.IPAddress

31

32 If (ipcurrent.Length = 0) Then

33 Row.GeographyKey = -1

34 Return

35 End If

36

37 Dim i As Integer

38 For i = 0 To dataSet.Tables("DimGeography").Rows.Count - 1

39 Dim dtRow As DataRow = dataSet.Tables("DimGeography").Rows(i)

40 Dim ipsections As String()

41 ipsections = dtRow("IPSections").ToString().Split(CChar(";"))

42

43 Dim ipsection As String

44 For Each ipsection In ipsections

45 If (IpBetween(ipcurrent, ipsection)) Then

46 GeographyKey = Convert.ToInt32(dtRow("GeographyKey").ToString())

47 Exit For

48 End If

49 Next

50 If (GeographyKey <> 0) Then

51 Row.GeographyKey = GeographyKey

52 Return

53 End If

54 Next

55 GeographyKey = -1

56 Row.GeographyKey = GeographyKey

57 Catch

58 GeographyKey = -1

59 Row.GeographyKey = GeographyKey

60 End Try

61

62 End Sub

63

64 Public Overrides Sub PreExecute()

65 MyBase.PreExecute()

66

67 Dim conn As OleDb.OleDbConnection

68 Dim adapter As OleDb.OleDbDataAdapter

69 Dim sql As String = "select GeographyKey,IPSections from DimGeography"

70

71 conn = New OleDb.OleDbConnection(Connections.DataCenterDW.ConnectionString + ";pwd=work8520")

72

73 adapter = New OleDb.OleDbDataAdapter(sql, conn)

74

75 Try

76 conn.Open()

77 adapter.Fill(dataSet, "DimGeography")

78 Catch ex As Exception

79

80 Finally

81 conn.Close()

82 End Try

83

84 End Sub

85

86 Private Function IpBetween(ByVal targetip As String, ByVal ipsection As String) As Boolean

87 Dim ipstart As String = ipsection.Split(CChar("-"))(0)

88 Dim ipend As String = ipsection.Split(CChar("-"))(1)

89

90 If (IPCompare.IsGreaterOrEqual(targetip, ipstart) And IPCompare.IsLessOrEqual(targetip, ipend)) Then

91 Return True

92 End If

93

94 Return False

95 End Function

96

97End Class

98

99

这样就OK了。

附件里面我提供了这个包文件和地区维度表数据的下载,希望能够给朋友们提供一点微薄的帮助。

附件下载
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: 
相关文章推荐