当咱们聊数据品质的时辰,咱们在聊些甚么?

2021-02-05 11:38:16 dms@root 9

导读:

GENZHEDASHUJUXINGYEDESHENQIECHENGZHANG,SHUJUPINZHIYULAIYUCHENGWEIYIGERAOBUKAIDEHUATI,NEIDANGDASHIZAILIAOSHUJUPINZHIDESHICHEN,FANSHIHUILIAOSHENMENI?CONGSHENMESHISHUJUPINZHIQITOU。

一、甚么是数据品质

数据品质:一个评价法则维度供给一种丈量与办理信息和数据的体例。

BIANBIEFAZEWEIDUYOUZHUYU:

  • JIANGWEIDUYUYINGYEXUYAOXIANGHUNPEI,BINGQIEFENBIEPINGJIADEQIANHOUAICI;

  • LINGHUICONGMEIWEIDUDEPINGJIAZHONGNENGGOUHUOXU/BUNENGGOUHUOXUHUODESHENME;

  • ZAISHICHENHEZIBENWUXIANDEHUANJINGXIA,GENGHAODIJIESHUOHEBANLIMINGMUDASUANZHONGDEBULVAICI。

数据品质检点首要分为以下法则维度:

  • WANQUANXING(Completeness):YONGLAIMIAOXIEXINXIDEWANQUANSHUIPING。

  • DUYIXING(Uniqueness):YONGLAIMIAOXIESHUJUSHIBUSHICUNZAIFANFUJISHI,BUSHITIGUOSHENGXIANXIANYICI。

  • YOUYONGXING(Validity):YONGLAIMIAOXIEMOZIHUOSHUJUSHIBUSHIZHIZUYONGHUJIESHUODEQIANTI。FANSHIFUCONGMING、SHUJUFANLI、ZHANGDU、ZHIYU、QUZHIGUIMO、NEIRONGGUIFANDENGFANGMIANTINGZHISHUFU。

  • FENQIXING(Consistency):YONGLAIMIAOXIETONGYIXINXIZHUTIZAICHABIEDESHUJUJIHEXINXISHUXINGSHIBUSHIBUYI,GESHITI、SHUXINGSHIBUSHIHESHIFENQIXINGSHUFUGANXI。

  • JINGQUEXING(Accuracy):YONGLAIMIAOXIESHUJUSHIBUSHIYUQIDUIYINGDEKEGUANSHITIDETEDIANXIANGFENQI(XUYAOYIGEKENDINGDEHEKEBAIHOUDEQUANSHIJUZICANKAOYUAN)。

  • SHISHIXING(Timeless):YONGLAIMIAOXIECONGYINGYECHANSHENGDAODUIYINGSHUJUJINGQUECUNCHUBINGKEPUTONGCHACHAODESHICHENJULISHUIPING,YEJIAOSHUJUDEYANBUSHIZHANG,SHUJUZAISHISHIXINGSHANGYINGNENGJINNENGGOUTIEHEYINGYEXIANSHICHANSHENGSHIDIAN。

  • KETUOXING(credibility):YONGLAIMIAOXIESHUJUCHANSHENGSHIBUSHIHESHIKEGUANJILV。

MEIFAZEWEIDUNENGGOUXUYAOCHABIEDEHUAIBAOTILI、JIHUIHELIUCHENG。ZHEIJIUZHISHILESHIXIANJIANDIANPINGJIASUOXUYAODESHICHEN、KUANXIANGHERENLIZIBENHUIXIANXIANCHUCHABIE。SHUJUPINZHIDEJINSHENGBUSHIYIHUIERJIUDE,ZAIQINGXILINGHUIPINGJIAMEIWEIDUSUOXURENWUDEHUANJINGXIA,TIAOXUANNEIXIEYIHOUJIAOWEIHUOJIDEJIANDIANWEIDUHEFAZE,CONGYIDAONAN、YOUQIANRUSHENDIMANMANBIANCESHUJUPINZHIDEZHOUQUANBANLIYUJINSHENG。FAZEWEIDUDEKAIDUANPINGJIACHENGGUOSHIKENDINGJIXIAN,QIYUPINGJIAZEZUOWEICHIXUJIANCEHEXINXIGAILIANGDEYIJUBU,ZUOWEIYINGYECAOZONGLIUCHENGDEYIJUBU。

二、数据完全性

数据完全性维度大类下可细分为以下维度小类:

  • 非空束缚:描写检点工具是不是存在数据值为空的环境。如客户开户时,客户称号是必填项,不能显现为空的环境。

FEIKONGSHUFUBINIQINGYIDONGDE,JIANLVEDEJIANGBIANSHIZIDUANBUNENGWEIKONG,CHACHAOTILIYEBINIQINGYI,ZHIXUYAOSHEDINGXUYAOCHACHAODEZIDUAN,JINGYOUGUOCHENG sql CHAWENLIEZHIBUNENGWEIKONGBIANKE。JIANGWEIKONGDESHUJUCHAWENCHULAITINGZHIZHENGGAI。

GURANFEIKONGSHUFUNENGGOUJINGYOUGUOCHENGSHEZHIFEIKONGSHUFUDETILIXIANDINGSHUJUMEIFAXIERUSHUJUKU,RUOSHICHENGCHIZHEILEITILINENGGOUFANGZHIGUOHOUDESHUJUFEIKONGCHACHAO。

三、数据独一性

数据独一性维度大类下可细分为以下维度小类:

  • DUYIXINGSHUFU:MIAOXIETONGYIKEGUANSHITIZAICHABIEYINGYESHUJUJIHEDEXINXI,JINGZHENGHEHOUSHIDUYIDE,ZHENDUIFANGZHENFANSHISHIDANYIZHUJIANHUOJIEHEZHUJIAN,RUZHENGJIANFANLI+ZHENGJIANHAOMA+XINGMINGBUYI,ZEQIKEHUBIANHAOYINGDUYI。

JUGEJIANLVEDELIZI,DUYIXINGSHUFUZAISHOUYISHANGPUTONGJUYOUDUYIDEBIAOSHIZIDUANNENGGOUJIANDINGQIDUYIXING,ZAIYINGYESHANGNENGGOUJINGYOUGUOCHENGJIGEJIEQIAGANXIDEYINGYESHUXINGDUIKENDINGDUYIYINGYESHITI。RUOZAIZHEILEIHUANJINGXIANXIANSHUJUFANFUDETIMU,JIWEIBEILEDUYIXINGSHUFU。ZHEILEIHUANJINGRUOSHISHIDANYIDEYINGYEZHUJIAN,NENGGOUJINGYOUGUOCHENGDUIZHUJIANFENZUQUZHONGDETILICHACHAO,RUOSHISHIYINGYEJIEHESHUXINGJIANDINGDUYISHITIDEHUANJING,ZHINENGYINGYEZHIYUANTINGZHISHOUDONGCHACHAO。

四、数占有用性

数占有用性维度大类下可细分为以下维度小类:

  • DAIMAZHIYUSHUFU:MIAOXIEJIANDIANGONGJUDEDAIMAZHISHIBUSHIZAIDUIYINGDEDAIMABIAONEI。RUYINGYEFAZEJIESHUO“XINGBIE”DEQUZHIYINGDANGSHI“1-WEIZHIDEXINGBIE”、“2-NANXING”、“3-NVXING”、“4-WEISHENMINGDEXINGBIE”,RUOSHIXIANXIAN“A”、“B”RUXUDEQUZHI,ZEYIWEI“XINGBIE”DEDAIMAZHIYUCUNZAITIMU;

  • ZHANGDUSHUFU:MIAOXIEJIANDIANGONGJUDEZHANGDUSHIBUSHIZHIZUZHANGDUSHUFU。RU“JINRONGJIGOUBIANMA”ZAI《GUOMINYINXINGJINRONGJIGOUBIANMAGUIFAN》ZHONGHUADINGZHANGDUWEI14WEI,RUOSHIXIANXIANFEI14WEIDEZHI,ZEJIANDINGWEIBUZHIZUZHANGDUSHUFU,BUSHIYIGEYOUYONGDE“JINRONGJIGOUBIANMA”;

  • NEIRONGGUIFANSHUFU:MIAOXIEJIANDIANGONGJUDEZHISHIBUSHIGENJUBIRANDEQINGQIUHEGUIFANTINGZHISHUJUDELURUYUCUNCHU。RU“CUNKUANZHANGHAO”YINGJINHANSHUZI,RUOSHIXIANXIANZIMUHUOQIYUBUFAZIFU,ZEBUSHIYIGEYOUYONGDE“CUNKUANZHANGHAO”,BUZHIZUNEIRONGGUIFANSHUFU;

  • QUZHIGUIMOSHUFU:MIAOXIEJIANDIANGONGJUDEQUZHISHIBUSHIZAIYUJIESHUODEGUIMONEI。RU“SHOUXINEDU”QUZHIGUIMOYINGDAYUJISHI 0,RUOSHIXIANXIANXIAOYU 0 DEHUANJING,ZECHAOYUELEQUZHIGUIMODESHUFU,BUSHIYIGEYOUYONGDE“SHOUXINEDU”。

代码值域束缚

MIAOXIEJIANDIANGONGJUDEZHISHIBUSHIGENJUBIRANDEQINGQIUHEGUIFANTINGZHISHUJUDELURUYUCUNCHU。

LI 1 : YIYINGYEFAZEXINGBIEZHIYAO “0:NAN” ,”1:NV”,ZEXINGBIEZIDUANZHIYINGXIANXIAN0HUO1。

LI 2 : HUOQUANDAIMA (CURCODE) ZHIYINGYOURMBHUOSHIUSDZHI。

SHUJUPINZHIZHONGDAIMAZHIYUQISHOUYAOZHIDINGQIYEJIDETONGYIBIANMABIAO,ERHOUGENJUDUIBIGANXITINGZHI etl ZHUANHUAN,ZHIYUCHUBAOGAOZHIXUYAOJINGYOUGUOCHENG sql CHAWENBUZAIGUIMONEIDESHUZHIJIUNENGGOULE。

长度束缚

MIAOXIEJIANDIANGONGJUDEZHANGDUSHIBUSHIZHIZUZHANGDUSHUFU。

BIFANGSHENFENZHENGHAOSHI 18 WEI。

ZHANGDUSHUFUNENGGOUJINGYOUGUOCHENGJIANBIAOSHIZHIDINGZIFUZHANGDUQUXIANDING,RUOSHIYINGYEXITONGZUIHOUBUZUOXIANDING,ZHINENGJINGYOUGUOCHENG sql JIANDINGZHANGDUDETILIHUODEFEICHANGZHIZAITINGZHICHUZHI。

内容规范束缚

描写检点工具的值是不是根据必然的请求和规范停止数据的录入与存储。

比方:余额或日期等普通城市根据牢固范例存储,若是最后设想为字符型后续应根据对应范例调剂。
起首这类环境最好一起头就建立好同一规范,根据营业寄义去指定手艺范例。若是最后做的不好,能够经由过程范例停止数据探查,对数据同一格局化。

取值规模束缚

描写检点工具的取值是不是在预界说的规模内。

比方:余额不能为正数,日期不能为正数等等。
若是营业初始不做限定,只能经由过程 sql 去对数据过滤查问,对有题目数据集合 etl 处置。

五、数据分歧性

数据分歧性维度大类下可细分为以下维度小类:

  • DENGZHIFENQIXINGYIKAOSHUFU:MIAOXIEJIANDIANGONGJUZHIJIANSHUJUQUZHIDESHUFUFAZE。YIGEJIANDIANGONGJUSHUJUQUZHIBIXUYULINGWAIYIGEHUODUOGEJIANDIANGONGJUZAIBIRANFAZEXIAXIANGCHENG。

  • CUNZAIFENQIXINGYIKAOSHUFU:MIAOXIEJIANDIANGONGJUZHIJIANSHUJUZHICUNZAIGANXIDESHUFUFAZE。YIGEJIANDIANGONGJUDESHUJUZHIBIXUZAILINGWAIYIGEJIANDIANGONGJUZHIZUMOUYIQIANTISHICUNZAI。

  • LUOJIFENQIXINGYIKAOSHUFU:MIAOXIEJIANDIANGONGJUZHIJIANSHUJUZHILUOJIGANXIDESHUFUFAZE。YIGEJIANDIANGONGJUSHANGDESHUJUZHIBIXUYULINGWAIYIGEJIANDIANGONGJUDESHUJUZHIZHIZUMOUZHONGLUOJIGANXI(RUDAYU、XIAOYUDENG)。

等值分歧性依靠束缚

PUTONGZHIWAIJIANJIEQIAGANXIDECHANGJING。BIFANG:BAODANBIAO,LIPEIBIAODEBAODANHAOCUNZAIBAODANZHUBIAO,TONGYIZHANGBIAO,LIANGGEZIDUANZHIJIANDEJIEQIAGANXIGANXI。

存在分歧性依靠束缚

首要是夸大营业的接洽干系性,一个状况产生了则某个值必然会若何。
比方:投保状况为已投保,则投保日期不应为空;

逻辑分歧性依靠束缚

SHOUYAOKUADADESHIZIDUANJIANDEXIANGHUSHUFUGANXI。

BIFANG:TOUBAOQITOUSHICHENXIAOYUJISHITOUBAOJUNSHISHICHEN。

六、数据精确性

数据精确性首要是指取值的精确性,描写该检点工具是不是与其对应的客观实体的特点相分歧。

BIFANG:TOUBAORENDEXINGBIEDAIMAWEI0-NVXING,GURANZHIZUDAIMAZHIYUSHUFU,DANQUEBUZHIZUQUZHIJINGQUEXINGSHUFU,YINWEIGAIBAOCHOUNANXING,QIXINGBIEDAIMAYINGWEI1-NANXING;

ZAIRU:GUOJIBAOHANYINGYEDESHOUXUFEIYINGLURUWEIGUOJIBAOGUANSHOUXUFEIZHICHU,QUELURUCHENGGUOJIBAOGUANSHOUXUFEIZHICHU。

精确性请求不只数据的取值规模和内容规范知足有用性的请求,其值也是客观实在天下的数据。因此可知,有用的数据必然是精确的,反之建立。
精确性凡是须要营业职员或其余当事人手工核对。

KANDAIZHEILEIHUANJING,SHUJUPINZHIFAZEMEIFANGFAJIANJIETONGYICHUZHI,ZHINENGJINGYOUGUOCHENGJIBIANCHAWENDETILIDUISHUJUCHENGGUOTINGZHIJUTIHEDUI。

七、数据实时性

实时性束缚:描写检点数据可否实时反应其对应的现实营业的时点状况。

BIFANG:XITONGZHONGCUNKUANWUJIFENLEIDEFENLEIBIXIANSHIZHONGDETIZAOJITIANBIANGENG;ZAIRULICAIYINGYEZAILICAIXITONGZHONGSHISHENGLIZHUANGKUANG,DANZAIJIAODIANXITONGZHONGQUEYINTONGXUNDEYUANYINCIBURUZHANG。

SHISHIXINGYINWEIDUOGEXITONG、TONGXUNDENGYUANYINCIXINGCHENG,FANSHIXUYAOYINGYEZHIYUANHUOXITONGZHIYUANSHOUGONGHEDUI。

PUTONGLAIJIANGSHUJUTONGBUDOUSHIJIYUYINGYEXITONGDELUOBIAOSHOUYIZIDUAN(BIFANG:CREATE_DT),ERSHIZAIYINGYECHANSHENGDESHICHENNENGGOUYUGAIZIDUANCUNZAISHICHENJULI。NENGGOUJINGYOUGUOCHENGJIANLVEDEsqlDUILIANGGESHICHENBINI,JIANDINGSHUJUDESHISHIXINGSHIBUSHIHESHIXUYAO。

八、数据可托性

数据可托性束缚:描写在数据同步中逐日/月增量数据是不是合适实际的经历值。

BIFANG:BAOSHUANGSHUJUDEZHURIFENQUSHUJUJIAOQIANRIPUTONGYOU 10% ZENGJIA,ERANSHUJUZENGJIABIANWEI200%,ZHEILEIHUANJINGYOUNENGGOUSHISHUJUTONGBUXIANXIANTIMU。

ZAIRU:MEIGEYUEDEYINGSHOUZONGEPUTONGDOUANBIRANJILVXIADIE,ERANSHUJUDONGYAOJIAODAZEPUTONGDOUNENGGOUXIANXIANTIMU。

KETUOXINGQINGQIUSHUJUDEZONGLIANGDONGYAOHESHIGENJIKEGUANJILV,PUTONGJINGYOUGUOCHENGDUI 7、15、30 RISHUJUTINGZHIBINI,RUOSHIXIANXIANCHAYIJIAODAZETINGZHIJUTIDETIMUTANCHA。

DUIYUDAMEISHENG:

BEIJINGDAMEISHENGRUANJIANGUFENWUXIANGONGSI(YIXIAJIANCHENG“DAMEISHENG”,GUPIAODAIMA:430311)SHIYIJIAKUAPINGTAIZICHANQUANSHOUQISHUJUBANLI(ALIM)PINGTAIGONGJISHANG,NULIYUJINGYOUGUOCHENGZILIKESHIHUA、QINGLIANGHUAJIAODIANSHOUYI,JIYUGONGCHENGHEYUNWEIYITIHUASHUJU,WEIKEHUGOUJIAN“SHUZILUANSHENG(Digital Twin)”,DAZAOQUANSHOUQIZICHANBANLIYUDAIJIAJINSHENGCHUZHIDASUAN。

DAMEISHENGJINGYOUGUOCHENGSHIDUONIANSHOUYIDAMOHEJINGLIJILEI,CHUANGJIAN“SANWEIYITI”CHANWUHEBANSHI:

“1”GEYINQING:KUAPINGTAIKESHIHUAYINQINGeZWalker;

“2”GEPINGTAI:GONGCHANGKESHIHUADASHUJUBANLIPINGTAIPIMCenter/eZWalker TeslaKESHIHUAKAIPIPINGTAI;

“3”GELIYONG:GONGCHENGMINGMUBANLIXITONG PIMCenter PPM/YUNXIETONGJIYIJIAOXITONG PIMCenter HO /ZICHANBANLIXITONG PIMCenter APM。

达美盛为国度高新手艺企业、中关村高新手艺企业、《工程扶植规范化》理事会理事单元, ISO15926国际首家倡议成员,CFIHOS构造成员单元,并经由过程了ISO9001、ISO14001、OHSAS18001等品质系统认证;荣获2016年度最好数字化工场办事商、2016年度“根本设 施可视化资产办理最好利用奖”;其全资子公司四川达美盛工程设想无限公司具有煤油自然气设想天资。


标签:  数据品质 数据完全性 数据 数占有用性 数据分歧性