
Data Lineage: A Storage Implementation with the Neo4j Graph Database

2024-07-10 23:20 | Source: compiled from the web

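Background

Metadata management covers a lot of ground. This article focuses on one especially important part of it: how data lineage is stored.

Data lineage spans cluster-level, system-level, table-level, and field-level lineage. It points to the upstream sources of a piece of data and lets you trace the data back to its origin. In practice, "lineage" usually means table-level and field-level lineage. It makes the data processing logic visible, helps locate the scope affected by an abnormal field, and narrows a data backfill to the smallest necessary range, which lowers the cost of understanding data and resolving data problems. Lineage also integrates well with a data quality monitoring system: when a quality check on important data fails, the lineage graph shows directly which data is affected.

The raw material for lineage can be collected by deploying listeners or hook programs on the database server side and parsing the SQL dialects in use, such as Oracle, Greenplum, MySQL, Hive, Presto, Spark SQL, and Flink SQL. The parsed result is then stored in the Neo4j graph database.

Data lineage is a complex network of relationships. A graph database is chosen for storage because it is built on graph theory and is designed for networks of nodes and edges. Multi-hop relationship queries are typically much cheaper to run and simpler to express than the equivalent chains of joins in a traditional relational database.

This article covers the features that can be built on data lineage, how to install and use the Neo4j graph database, and a worked example.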

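Data lineage

By covering multiple engines such as Oracle, Greenplum, MySQL, Hive, Presto, Spark, and Flink, one can generate data lineage and support data value analysis, impact analysis, and storage lifecycle management. The following features can be built while generating lineage or on top of it:

Data lineage:
- Lineage levels: cluster, system, table, and field lineage.
- Roll-up and drill-down across these levels.
- Upstream tracing for a field: display at cluster, system, table, or field granularity (selectable), display of upstream dependency paths up to a selectable length, and a clear view of the field's processing logic.

Data value analysis:
- While generating lineage or data flows, record dimensions such as access method, accessing cluster, application, system, table, partition and field, access frequency, storage method, accessing department, accessing user, and so on.
- Rate how "hot" the data is by access frequency, value density, access efficiency, and similar measures.

Impact analysis:
- Downstream dependency search for a field: display at cluster, system, table, or field granularity (selectable) and display of downstream dependency paths up to a selectable length, so that the affected scope can be pinpointed.

Integration of data lineage with a data quality monitoring system:
- Task scheduler runs a task -> field-level data lineage (when it changes) -> data quality monitoring system -> field-level quality checks -> data lineage.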
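The upstream tracing and downstream impact search described above, each with a selectable path length, can be sketched in plain Python before any database is involved. This is a minimal illustration, not the article's implementation: the edge list, the dotted field names, and the helper names `build_index` and `trace` are all assumptions made for the example.

```python
from collections import deque

# Hypothetical field-level lineage edges: (upstream field, downstream field).
# The names are illustrative only and do not come from a real system.
EDGES = [
    ("ods.user.user_id", "dwd.user.user_id"),
    ("dwd.user.user_id", "dws.user_profile.user_id"),
    ("dws.user_profile.user_id", "ads.report.user_cnt"),
    ("ods.user.age", "dwd.user.age"),
]


def build_index(edges):
    """Return (downstream, upstream) adjacency maps for the edge list."""
    downstream, upstream = {}, {}
    for src, dst in edges:
        downstream.setdefault(src, []).append(dst)
        upstream.setdefault(dst, []).append(src)
    return downstream, upstream


def trace(start, adjacency, max_depth):
    """Breadth-first search from start, following adjacency for at most
    max_depth hops. Returns {field: hop count}, excluding the start field."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:
            continue  # do not expand beyond the selected path length
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen


DOWN, UP = build_index(EDGES)

# Impact analysis: what is affected within two hops downstream?
impact = trace("ods.user.user_id", DOWN, max_depth=2)
print(impact)

# Upstream tracing: where does a report field come from, up to three hops?
origin = trace("ads.report.user_cnt", UP, max_depth=3)
print(origin)
```

The same function serves both directions because only the adjacency map changes. In a graph database this traversal is what a variable-length path query performs.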

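Installing Neo4j (Mac)

Installing Neo4j on a Mac is fairly simple. It requires a Java JDK, which is not covered here; install it first. Neo4j can be downloaded from: https://neo4j.com/download-center/#community

The Community Edition is used for the installation and examples here. The technical differences between the Neo4j Community Edition and the Enterprise Edition are as follows:

Capacity: the Community Edition supports up to roughly 34 billion nodes, 34 billion relationships, and 68 billion properties, while the Enterprise Edition does not have this limit.
Concurrency: the Community Edition can only run as a single instance and cannot form a cluster. The Enterprise Edition can be deployed as a high-availability cluster or a causal cluster, which addresses high concurrency.
Disaster tolerance: because the Enterprise Edition supports clustering, the failure of some instances does not stop the whole system.
Backup: the Community Edition supports only cold backup, which means the service must be stopped first. The Enterprise Edition supports hot backup, with a full backup the first time and incremental backups afterwards.
Performance: the Community Edition uses at most 4 cores, while the Enterprise Edition can use all cores and includes additional performance optimizations.
Support: Enterprise Edition customers get 5x10 phone support.
Tools: the Enterprise Edition can use tools such as Bloom and ETL, which the Community Edition does not support.

Either edition is sufficient for storing data lineage. The Community Edition is used here.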

After downloading and unpacking, change to the bin directory:
cd /Users/hyi/neo4j/neo4j-community-3.5.18/bin
Start the service:
./neo4j start
Stop the service:
./neo4j stop
Then open the management UI in a browser. For a local installation the address is: http://localhost:7474/

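On first login the default username and password are neo4j/neo4j, and you are prompted to change the password. Pick one you can remember; the simple password 000000 is used here, which is acceptable only for a local demo. After the password is changed, the Neo4j Browser main screen opens.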

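Neo4j basics

Neo4j is a high-performance NoSQL graph database that stores structured data in a network (a graph, in mathematical terms) rather than in tables. Its defining feature is that relationships are stored as data in their own right. It is an embedded, disk-based Java persistence engine with full transactional support, and it can also be seen as a high-performance graph engine with the features of a mature database.

Neo4j is one of the most popular graph databases and supports full transactions. In the property graph model, a graph is made up of vertices, edges, and properties. Vertices are also called nodes and edges are also called relationships, and each node or relationship can carry one or more properties. A graph in Neo4j is a directed graph built from nodes and relationships. Its query language, Cypher, is widely regarded as a de facto standard among graph query languages.

Neo4j is a large topic. Only the most basic elements and concepts are covered here, enough to follow the examples below. Readers who need more depth should consult other material.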

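Node

A node is a basic element of a graph database and represents an entity, much like a row in a relational database. A node can have multiple properties and multiple labels.

Relationship

A relationship is called an edge in graph theory. It connects two nodes, and both its start and its end must be nodes. Like a node, a relationship can have multiple properties, but it has exactly one type.

Property

Both nodes and relationships can have multiple properties. Properties are key-value pairs, similar to a Map in Java.

Path

A path consists of nodes and relationships. A path has a length, which is the number of relationships in it.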
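To make these four concepts concrete, here is a small in-memory model in Python. It is only a sketch of the property graph ideas described above, not Neo4j's internal representation. The class names mirror the concepts, and the `Path` check that each relationship starts where the previous one ended is a simplification chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    """An entity with any number of labels and key-value properties."""
    labels: Set[str]
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed edge between two nodes with exactly one type."""
    start: Node
    end: Node
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Path:
    """A chain of relationships, each starting where the previous one ended."""
    relationships: List[Relationship]

    def __post_init__(self) -> None:
        for prev, cur in zip(self.relationships, self.relationships[1:]):
            if prev.end is not cur.start:
                raise ValueError("relationships do not form a connected path")

    @property
    def length(self) -> int:
        # Path length is the number of relationships, not the number of nodes.
        return len(self.relationships)


ods_field = Node({"field"}, {"fieldName": "user_id", "level": "ods"})
dwd_field = Node({"field"}, {"fieldName": "user_id", "level": "dwd"})
dwd_table = Node({"table"}, {"tableName": "insurance_fact_user_base_information1"})

path = Path([
    Relationship(ods_field, dwd_field, "FROM"),
    Relationship(dwd_field, dwd_table, "FROM"),
])
print(path.length)  # 2
```

Three nodes joined by two relationships give a path of length 2, which matches the definition of path length given above.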

Cypher is the graph query language created by Neo Technology for Neo4j. Its syntax resembles SQL. The examples below introduce it briefly.

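An example:

Create a field node with several properties:

create (job_type:field {fieldName:'job_type', dataType:'string', comment:'occupation', level:'ods', dataBase:'ods_insurance', tableName:'insurance_fact_user_base_information', cluster:'insurance_cluster', from:'EDW'})

create (n): this form creates a node. Labels and properties can be added inside the parentheses, and the node can then be connected to other nodes through relationships and paths.

job_type: the variable name.

field: the label. A node can have multiple labels.

fieldName:'job_type', dataType:'string', and so on are the node's properties. A node can have multiple properties, written as key-value pairs inside the braces {}.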
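Because real lineage is loaded from parsed metadata rather than typed by hand, statements of this shape are usually generated. The sketch below renders a `create` statement from a Python dict. The helper names `cypher_string` and `create_node` are hypothetical, and the code assumes that the variable, label, and property keys are already plain identifiers; only property values are escaped.

```python
def cypher_string(value: str) -> str:
    """Quote a Python string as a single-quoted Cypher string literal,
    escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def create_node(variable: str, label: str, properties: dict) -> str:
    """Render `create (variable:label {key:'value', ...})`.

    Assumes variable, label and the property keys are safe identifiers.
    All values are written as strings, as in the article's examples.
    """
    body = ", ".join(
        f"{key}:{cypher_string(str(val))}" for key, val in properties.items()
    )
    return f"create ({variable}:{label} {{{body}}})"


statement = create_node(
    "job_type",
    "field",
    {"fieldName": "job_type", "dataType": "string", "level": "ods"},
)
print(statement)
# create (job_type:field {fieldName:'job_type', dataType:'string', level:'ods'})
```

String building like this is convenient for producing a script. When statements are sent through a driver, query parameters are generally the safer way to pass values.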

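Another example:

Match nodes with the two labels field and table, create a relationship between each pair that satisfies the where condition, and then query the resulting paths. The two statements are run one after the other.

match (m:field),(n:table) where m.tableName = n.tableName and m.dataBase = n.dataBase and m.level = n.level create (m)-[:FROM]->(n)
// query the paths
MATCH p=()-[r:FROM]->() RETURN p

Cypher has many more features than shown here; see the official Neo4j documentation for details.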

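A worked data lineage example

In this example the fields of two tables are mapped directly onto each other, which forms the lineage. The table insurance_fact_user_base_information (user basic information) is the source (upstream) table, and insurance_fact_user_base_information1 (user basic information 1) is the target (downstream) table. The fields are common user dimensions, and both the field nodes and the table nodes carry table, database, and cluster information, so lineage can be expressed at multiple levels.

The Neo4j script follows.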

Create the two table nodes. The original script repeated the comment key for the table comment and the database comment; a map cannot hold two values under one key, so the database comment is stored under dbComment here, a property name chosen for this rewrite.

create (insurance_fact_user_base_information:table {tableName:'insurance_fact_user_base_information',comment:'user basic information table',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
create (insurance_fact_user_base_information1:table {tableName:'insurance_fact_user_base_information1',comment:'user basic information table 1',tableType:'FACT',updateType:'ALL',level:'dwd',dataBase:'dwd_insurance',dbComment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})

Create the field nodes of insurance_fact_user_base_information:

create (user_id:field {fieldName:'user_id',dataType:'string',comment:'user id',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (job_type:field {fieldName:'job_type',dataType:'string',comment:'occupation',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (document_type:field {fieldName:'document_type',dataType:'string',comment:'ID document type',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (document_address:field {fieldName:'document_address',dataType:'string',comment:'residential address on ID document',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (document_no:field {fieldName:'document_no',dataType:'string',comment:'ID document number',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (issue_group_org:field {fieldName:'issue_group_org',dataType:'string',comment:'issuing authority',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (name:field {fieldName:'name',dataType:'string',comment:'name',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (nation:field {fieldName:'nation',dataType:'string',comment:'ethnicity',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (document_validate_date:field {fieldName:'document_validate_date',dataType:'string',comment:'ID document expiry date',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (age:field {fieldName:'age',dataType:'string',comment:'age',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (sex:field {fieldName:'sex',dataType:'string',comment:'sex',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (degree:field {fieldName:'degree',dataType:'string',comment:'education level',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (marriage_status:field {fieldName:'marriage_status',dataType:'string',comment:'marital status',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (email:field {fieldName:'email',dataType:'string',comment:'email',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (company:field {fieldName:'company',dataType:'string',comment:'company name',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (average_income_by_month:field {fieldName:'average_income_by_month',dataType:'string',comment:'average monthly income',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (latitude:field {fieldName:'latitude',dataType:'string',comment:'latitude',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (longitude:field {fieldName:'longitude',dataType:'string',comment:'longitude',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (province:field {fieldName:'province',dataType:'string',comment:'province',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (city:field {fieldName:'city',dataType:'string',comment:'city',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (county:field {fieldName:'county',dataType:'string',comment:'county or district',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (detail_address:field {fieldName:'detail_address',dataType:'string',comment:'detailed address',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (graduationed_year:field {fieldName:'graduationed_year',dataType:'string',comment:'graduation year',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (first_connect_mobile:field {fieldName:'first_connect_mobile',dataType:'string',comment:'primary contact phone',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (first_connect_name:field {fieldName:'first_connect_name',dataType:'string',comment:'primary contact name',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (common_connect_name:field {fieldName:'common_connect_name',dataType:'string',comment:'general contact name',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (common_connect_mob:field {fieldName:'common_connect_mob',dataType:'string',comment:'general contact phone',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (registered_mobile:field {fieldName:'registered_mobile',dataType:'string',comment:'registered mobile number',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (user_status:field {fieldName:'user_status',dataType:'string',comment:'user status',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (children_info:field {fieldName:'children_info',dataType:'string',comment:'children information',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (other_info:field {fieldName:'other_info',dataType:'string',comment:'other information',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (create_time:field {fieldName:'create_time',dataType:'string',comment:'creation time',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (update_time:field {fieldName:'update_time',dataType:'string',comment:'update time',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})
create (batch_time:field {fieldName:'batch_time',dataType:'string',comment:'batch run time',level:'ods',dataBase:'ods_insurance',tableName:'insurance_fact_user_base_information',cluster:'insurance_cluster',from:'EDW'})

Create the field nodes of insurance_fact_user_base_information1:

create (user_id:field {fieldName:'user_id',dataType:'string',comment:'user id',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (job_type:field {fieldName:'job_type',dataType:'string',comment:'occupation',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (document_type:field {fieldName:'document_type',dataType:'string',comment:'ID document type',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (document_address:field {fieldName:'document_address',dataType:'string',comment:'residential address on ID document',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (document_no:field {fieldName:'document_no',dataType:'string',comment:'ID document number',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (issue_group_org:field {fieldName:'issue_group_org',dataType:'string',comment:'issuing authority',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (name:field {fieldName:'name',dataType:'string',comment:'name',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (nation:field {fieldName:'nation',dataType:'string',comment:'ethnicity',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (document_validate_date:field {fieldName:'document_validate_date',dataType:'string',comment:'ID document expiry date',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (age:field {fieldName:'age',dataType:'string',comment:'age',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (sex:field {fieldName:'sex',dataType:'string',comment:'sex',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (degree:field {fieldName:'degree',dataType:'string',comment:'education level',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (marriage_status:field {fieldName:'marriage_status',dataType:'string',comment:'marital status',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (email:field {fieldName:'email',dataType:'string',comment:'email',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (company:field {fieldName:'company',dataType:'string',comment:'company name',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (average_income_by_month:field {fieldName:'average_income_by_month',dataType:'string',comment:'average monthly income',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (latitude:field {fieldName:'latitude',dataType:'string',comment:'latitude',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (longitude:field {fieldName:'longitude',dataType:'string',comment:'longitude',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (province:field {fieldName:'usr_prv',dataType:'string',comment:'province',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (city:field {fieldName:'city',dataType:'string',comment:'city',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (county:field {fieldName:'county',dataType:'string',comment:'county or district',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (detail_address:field {fieldName:'detail_address',dataType:'string',comment:'detailed address',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (graduationed_year:field {fieldName:'graduationed_year',dataType:'string',comment:'graduation year',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (first_connect_name:field {fieldName:'first_connect_name',dataType:'string',comment:'primary contact name',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (first_connect_mobile:field {fieldName:'first_connect_mobile',dataType:'string',comment:'primary contact phone',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (common_connect_name:field {fieldName:'common_connect_name',dataType:'string',comment:'general contact name',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (common_connect_mob:field {fieldName:'common_connect_mob',dataType:'string',comment:'general contact phone',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (registered_mobile:field {fieldName:'registered_mobile',dataType:'string',comment:'registered mobile number',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (user_status:field {fieldName:'user_status',dataType:'string',comment:'user status',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (children_info:field {fieldName:'children_info',dataType:'string',comment:'children information',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (reason:field {fieldName:'reason',dataType:'string',comment:'remarks',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (other_info:field {fieldName:'other_info',dataType:'string',comment:'other information',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (task_serials:field {fieldName:'task_serials',dataType:'string',comment:'task serial number',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (create_time:field {fieldName:'create_time',dataType:'string',comment:'creation time',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (update_time:field {fieldName:'update_time',dataType:'string',comment:'update time',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})
create (batch_time:field {fieldName:'batch_time',dataType:'string',comment:'batch run time',level:'dwd',dataBase:'dwd_insurance',tableName:'insurance_fact_user_base_information1',cluster:'insurance_cluster',from:'EDW'})

Create the "belongs to" relationships between the field nodes above and their table nodes:

match (m:field),(n:table) where m.tableName = n.tableName and m.dataBase = n.dataBase and m.level = n.level create (m)-[:FROM]->(n)

Create the lineage relationships between the fields of the two tables:

match (m:field),(n:field) where m.fieldName = n.fieldName and m.tableName = 'insurance_fact_user_base_information' and n.tableName = 'insurance_fact_user_base_information1' create (m)-[:FROM]->(n)

Create the two database nodes:

create (insurance:database {level:'dwd',dataBase:'dwd_insurance',comment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})
create (insurance1:database {level:'ods',dataBase:'ods_insurance',comment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})

Create nodes for the other tables in each of the two databases. The dwd variables carry a dwd_ prefix here so that they do not clash with the ods variables when the block is run as one statement, and so that each variable matches its tableName.

// table nodes in the ods database
create (insurance_fact_user_base_information2:table {tableName:'insurance_fact_user_base_information2',comment:'user basic information table 2',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
create (insurance_fact_user_base_information3:table {tableName:'insurance_fact_user_base_information3',comment:'user basic information table 3',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
create (insurance_fact_user_base_information4:table {tableName:'insurance_fact_user_base_information4',comment:'user basic information table 4',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
create (insurance_fact_user_base_information5:table {tableName:'insurance_fact_user_base_information5',comment:'user basic information table 5',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
create (insurance_fact_user_base_information6:table {tableName:'insurance_fact_user_base_information6',comment:'user basic information table 6',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
create (insurance_fact_user_base_information7:table {tableName:'insurance_fact_user_base_information7',comment:'user basic information table 7',tableType:'FACT',updateType:'ALL',level:'ods',dataBase:'ods_insurance',dbComment:'xxx insurance ods',cluster:'insurance_cluster',from:'EDW'})
// table nodes in the dwd database
create (dwd_insurance_fact_user_base_information2:table {tableName:'insurance_fact_user_base_information2',comment:'user basic information table 2',tableType:'FACT',updateType:'ALL',level:'dwd',dataBase:'dwd_insurance',dbComment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})
create (dwd_insurance_fact_user_base_information6:table {tableName:'insurance_fact_user_base_information6',comment:'user basic information table 6',tableType:'FACT',updateType:'ALL',level:'dwd',dataBase:'dwd_insurance',dbComment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})
create (dwd_insurance_fact_user_base_information3:table {tableName:'insurance_fact_user_base_information3',comment:'user basic information table 3',tableType:'FACT',updateType:'ALL',level:'dwd',dataBase:'dwd_insurance',dbComment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})
create (dwd_insurance_fact_user_base_information4:table {tableName:'insurance_fact_user_base_information4',comment:'user basic information table 4',tableType:'FACT',updateType:'ALL',level:'dwd',dataBase:'dwd_insurance',dbComment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})
create (dwd_insurance_fact_user_base_information5:table {tableName:'insurance_fact_user_base_information5',comment:'user basic information table 5',tableType:'FACT',updateType:'ALL',level:'dwd',dataBase:'dwd_insurance',dbComment:'xxx insurance dwd',cluster:'insurance_cluster',from:'EDW'})
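The field-to-field lineage statement in the script links fields only when their fieldName values are equal. In the sample data the downstream province node has fieldName 'usr_prv', and the downstream table also has reason and task_serials, which the source table lacks. The sketch below reproduces that matching rule in Python on a small subset of the fields to show which ones end up without a lineage edge. The function name `match_by_name` is hypothetical.

```python
# A subset of the fieldName values from the two sample tables.
ODS_FIELDS = ["user_id", "province", "city", "other_info"]
DWD_FIELDS = ["user_id", "usr_prv", "city", "reason", "task_serials", "other_info"]


def match_by_name(source_fields, target_fields):
    """Pair fields whose names are equal, mirroring
    `where m.fieldName = n.fieldName` in the script.

    Returns (edges, unmatched_sources, unmatched_targets).
    """
    targets = set(target_fields)
    sources = set(source_fields)
    edges = [(name, name) for name in source_fields if name in targets]
    unmatched_sources = [n for n in source_fields if n not in targets]
    unmatched_targets = [n for n in target_fields if n not in sources]
    return edges, unmatched_sources, unmatched_targets


edges, no_downstream, no_upstream = match_by_name(ODS_FIELDS, DWD_FIELDS)
print(edges)          # [('user_id', 'user_id'), ('city', 'city'), ('other_info', 'other_info')]
print(no_downstream)  # ['province']
print(no_upstream)    # ['usr_prv', 'reason', 'task_serials']
```

A renamed column such as province to usr_prv therefore gets no edge under name matching. Lineage derived from parsed SQL, as described in the background section, would be needed to capture such mappings.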

Once the data above has been loaded, the small lineage graph of this example can be viewed with:

MATCH p=()-[r:FROM]->() RETURN p
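The query above returns every FROM relationship. To trace one field upstream or downstream with a selectable path length, a variable-length pattern such as `[:FROM*1..3]` can be used. As far as I know the hop bounds have to be literals rather than query parameters, so the sketch below builds the query text in Python after validating the bound. The function name `lineage_query` and the parameter names `$fieldName` and `$tableName` are assumptions made for this example. The direction follows the script in this article, where the source field points to the target field.

```python
def lineage_query(direction: str, max_hops: int) -> str:
    """Build a Cypher query that follows FROM relationships between
    field nodes for at most max_hops hops.

    direction: "downstream" follows outgoing FROM edges from the anchor
               field, "upstream" follows incoming ones.
    The anchor field is identified by the $fieldName and $tableName
    query parameters, which the caller supplies when running the query.
    """
    if direction not in ("upstream", "downstream"):
        raise ValueError("direction must be 'upstream' or 'downstream'")
    if isinstance(max_hops, bool) or not isinstance(max_hops, int) or max_hops < 1:
        raise ValueError("max_hops must be a positive integer")

    anchor = "(f:field {fieldName: $fieldName, tableName: $tableName})"
    hops = f"[:FROM*1..{max_hops}]"
    if direction == "downstream":
        pattern = f"{anchor}-{hops}->(other:field)"
    else:
        pattern = f"(other:field)-{hops}->{anchor}"
    return f"MATCH p={pattern} RETURN p"


print(lineage_query("downstream", 2))
print(lineage_query("upstream", 3))
```

Restricting the far end of the pattern to the field label keeps the field-to-table FROM relationships out of the result, since the script uses the same relationship type for both purposes.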

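The graph rendered by the stock Neo4j Browser is not especially polished; a custom front end can be developed on top of it.

In such an implementation, the scope affected by abnormal data can be located in one step, for example which downstream clusters, systems or applications, tables, and fields are affected, with views that switch between these levels. Tracing upstream to find a field's processing logic works the same way across levels.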
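Switching between levels amounts to rolling field-level edges up to a coarser granularity. Because every field node in the example carries its cluster, database, and table, the roll-up can be sketched as a truncation of the field's identity. The tuple layout, the `LEVELS` table, and the function name `roll_up` are assumptions for this illustration.

```python
# Each field is identified by (cluster, database, table, field),
# using values from the sample script in this article.
SRC = ("insurance_cluster", "ods_insurance", "insurance_fact_user_base_information")
DST = ("insurance_cluster", "dwd_insurance", "insurance_fact_user_base_information1")

FIELD_EDGES = [
    (SRC + ("user_id",), DST + ("user_id",)),
    (SRC + ("city",), DST + ("city",)),
    (SRC + ("age",), DST + ("age",)),
]

# Number of leading identity components kept at each level.
LEVELS = {"cluster": 1, "database": 2, "table": 3, "field": 4}


def roll_up(edges, level):
    """Collapse field-level edges to the given level.

    Edges whose two ends fall into the same unit at that level are
    dropped, because they are internal to the unit.
    """
    depth = LEVELS[level]
    rolled = set()
    for src, dst in edges:
        s, d = src[:depth], dst[:depth]
        if s != d:
            rolled.add((s, d))
    return sorted(rolled)


print(roll_up(FIELD_EDGES, "table"))     # one edge between the two tables
print(roll_up(FIELD_EDGES, "database"))  # one edge ods_insurance -> dwd_insurance
print(roll_up(FIELD_EDGES, "cluster"))   # [] since both tables share a cluster
```

Drilling down is the reverse lookup: for a chosen coarse edge, list the finer-grained edges whose truncated ends match it.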

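Summary

This article described storing data lineage in Neo4j and gave example statements that create lineage relationships. In real applications lineage is loaded as data rather than written by hand. How the lineage metadata is parsed from SQL or from applications was not covered; I may share that separately when there is a chance.

Data lineage is a very important part of metadata management. It shows where data comes from and where it goes, and it locates the scope affected by abnormal data. Impact analysis, one application of lineage, examines the downstream flow of data. When a system is upgraded or reworked, changes to or deletions of data structures can be communicated to downstream systems promptly. Dependency-based impact analysis quickly identifies which downstream systems, tables, and fields a metadata change will affect, which reduces the risk that comes with upgrades.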


