Dubbo Interface Recording

In RPC mode, Dubbo uses Hessian2 encoding by default and transmits over the Dubbo protocol. These two characteristics make it hard to intercept or analyze traffic at the transport layer, yet such interception and analysis are the technical foundation for management requirements like rate limiting, security auditing, and access control. To address these needs, the Flomesh team developed a Dubbo gateway, one of whose features is "interface recording": as traffic passes through the Dubbo gateway, the data is processed into content that can be analyzed and searched, including packet splitting, protocol handling, and encoding (JSON). So far, the main scenarios for this feature are:

  1. REST-to-RPC conversion. Whether you use the Dubbo gateway developed by Flomesh or Dubbo Proxy from the Dubbo community, the interface must be described in JSON before REST-to-RPC conversion can be implemented. Interface recording captures this JSON automatically, which greatly simplifies the work.
  2. Data recording and auditing. When Dubbo traffic needs to be audited, traffic recording turns the traffic into JSON, which is convenient for analysis and for processing by other programs.
  3. Data masking. As traffic passes through, specific fields can be masked, for example replacing part of a phone number with ****. Doing this in business code is problematic: it is hard to keep up with data-security rules that are added after the fact, and changing existing code carries risk. At the network layer, this class of problems can be solved transparently.
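
The masking described in item 3 can be sketched as follows. This is a minimal illustration of the idea, not piped's actual implementation; the field names and the 11-digit phone format are assumptions for the example:

```python
import json

def mask_phone(value: str) -> str:
    """Mask the middle digits of an 11-digit phone number, e.g. 13700113993 -> 137****3993."""
    if len(value) == 11 and value.isdigit():
        return value[:3] + "****" + value[7:]
    return value

def mask_fields(payload: dict, fields: set) -> dict:
    """Return a copy of a decoded JSON payload with the named fields masked."""
    return {k: mask_phone(v) if k in fields and isinstance(v, str) else v
            for k, v in payload.items()}

# Hypothetical payload, as it might look after the gateway decodes traffic to JSON:
record = json.loads('{"name": "alice", "phone": "13700113993"}')
print(mask_fields(record, {"phone"}))  # {'name': 'alice', 'phone': '137****3993'}
```

Because the gateway sees the decoded JSON form of every message, this kind of rule can be applied uniformly without touching business code.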

In this demo, we show how to implement "interface recording".

Copyright and Disclaimer

The piped program described in this document was developed by, and is copyrighted by, the Flomesh team (it is neither open source software nor freeware); it is provided here for testing. This document and the installation media may be downloaded free of charge for testing, learning, and other non-profit purposes. If you republish this content elsewhere, please credit the source (http://flomesh.cn). The piped program may not be used for profit. The Flomesh team maintains this document to keep it accurate, and maintains the piped program to keep it stable and reliable, but the Flomesh team assumes no liability for the use of this document or the program.

For errata, differing opinions or ideas, or general discussion, email us at support@polaristech.io, or add us on WeChat: 13700113993.

Prerequisites, Topology, and Preparation

We assume readers have basic knowledge of Dubbo, including how to run a Dubbo provider and consumer in a development environment.

The topology of this demo is very simple:

  1. A standalone zookeeper process listening on port 2181, with no access control. We use zkCli.sh, which ships with zookeeper, to operate on it.
  2. A Dubbo provider and consumer running in eclipse; the provider listens on port 20880.
  3. A standalone piped process that serves Dubbo on port 20881 and forwards requests to the provider running in eclipse (port 20880), recording the Dubbo requests and responses along the way.

Get the Demo Code

First, clone the dubbo-samples-zookeeper demo code:

git clone git@github.com:flomesh-io/dubbo-samples-zookeeper.git

Get the piped Program

piped is the core program of the Dubbo gateway:

wget http://repo.polaristech.io/piped/piped-0.1.0-62.el7_pl.x86_64.rpm
yum -y localinstall piped-0.1.0-62.el7_pl.x86_64.rpm

or

wget http://repo.polaristech.io/images/piped-alpine-0.1.0-62.tar.gz

The Flomesh team recommends running piped on RHEL 7 or a derivative (such as CentOS 7).

Get ZooKeeper

This demo uses zookeeper 3.4.13:

wget https://archive.apache.org/dist/zookeeper/zookeeper-3.4.13/zookeeper-3.4.13.tar.gz

Get Eclipse

The demo uses the Linux version of eclipse; other versions work the same way:

http://mirror.csclub.uwaterloo.ca/eclipse/technology/epp/downloads/release/2019-12/R/eclipse-java-2019-12-R-linux-gtk-x86_64.tar.gz

Run ZooKeeper

Running zookeeper is straightforward:

tar xzvf zookeeper-3.4.13.tar.gz
cd zookeeper-3.4.13
cp conf/zoo_sample.cfg conf/zoo.cfg
bin/zkServer.sh start

Confirm that zookeeper is listening on port 2181:

root@localhost:~/zookeeper-3.4.13$ netstat -ntlp | grep java
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp6       0      0 :::2181                 :::*                    LISTEN      4534/java           
tcp6       0      0 :::41707                :::*                    LISTEN      4534/java      
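
As an alternative to netstat, you can confirm the port from a short Python check. This is just a convenience sketch; it only tests that something accepts a TCP connection on 2181:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Expect True once zookeeper is up:
print(port_open("127.0.0.1", 2181))
```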

Now we use zkCli.sh, which ships with zookeeper, to access it. Below is the output from my run; `ls /` shows two nodes, dubbo and zookeeper:

root@localhost:~/zookeeper-3.4.13$ bin/zkCli.sh 
Connecting to localhost:2181
2020-03-18 23:05:38,621 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 04:05 GMT
2020-03-18 23:05:38,623 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=localhost
2020-03-18 23:05:38,623 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_181
2020-03-18 23:05:38,624 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2020-03-18 23:05:38,624 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/home/thomas/zookeeper-3.4.13/bin/../build/classes:/home/thomas/zookeeper-3.4.13/bin/../build/lib/*.jar:/home/thomas/zookeeper-3.4.13/bin/../lib/slf4j-log4j12-1.7.25.jar:/home/thomas/zookeeper-3.4.13/bin/../lib/slf4j-api-1.7.25.jar:/home/thomas/zookeeper-3.4.13/bin/../lib/netty-3.10.6.Final.jar:/home/thomas/zookeeper-3.4.13/bin/../lib/log4j-1.2.17.jar:/home/thomas/zookeeper-3.4.13/bin/../lib/jline-0.9.94.jar:/home/thomas/zookeeper-3.4.13/bin/../lib/audience-annotations-0.5.0.jar:/home/thomas/zookeeper-3.4.13/bin/../zookeeper-3.4.13.jar:/home/thomas/zookeeper-3.4.13/bin/../src/java/lib/*.jar:/home/thomas/zookeeper-3.4.13/bin/../conf:
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=4.15.0-30deepin-generic
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=thomas
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/thomas
2020-03-18 23:05:38,625 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/home/thomas/zookeeper-3.4.13
2020-03-18 23:05:38,626 [myid:] - INFO  [main:ZooKeeper@442] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@443b7951
Welcome to ZooKeeper!
2020-03-18 23:05:38,637 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1029] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2020-03-18 23:05:38,669 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@879] - Socket connection established to localhost/127.0.0.1:2181, initiating session
[zk: localhost:2181(CONNECTING) 0] 2020-03-18 23:05:38,681 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1303] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x10000040618000e, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0] ls /
[dubbo, zookeeper]
[zk: localhost:2181(CONNECTED) 1] 

Keep this zkCli.sh window open; we will use it later for queries, verification, and for registering the Dubbo gateway.

Run the Dubbo Gateway (piped)

After installing piped with yum, the piped binary is placed at /usr/local/bin/piped. piped is a statically linked program just over 700 KB in size. If you use the container image instead, the alpine-based image we provide is 5 MB compressed and 13 MB once imported into docker, and it also includes some common network troubleshooting tools.

piped needs a configuration file to start. Copy the following into /etc/piped/interface-recorder.ini:

[common]
log_level = debug

[pipeline.dubbo]
listen = 0.0.0.0:20881

[module.1]
name = dubbo-request-decoder

[module.2]
name = hessian2-decoder

[module.3]
name = json-encoder

[module.dumpreq]
name = dump
filename = /tmp/dubbo.json

[module.4]
name = json-decoder

[module.5]
name = hessian2-encoder

[module.6]
name = dubbo-request-encoder

[module.7]
name = proxy
upstream = 192.168.122.1:20880

[module.8]
name = dubbo-response-decoder

[module.9]
name = hessian2-decoder

[module.10]
name = json-encoder

[module.dumpres]
name = dump
filename = /tmp/dubbo.json

[module.11]
name = json-decoder

[module.12]
name = hessian2-encoder

[module.13]
name = dubbo-response-encoder

piped's configuration file is simple and intuitive: one file can define multiple pipelines, and each pipeline contains modules that execute in the order they are defined. piped is stream-based: as data flows in, it passes through the configured modules in sequence.

Because piped is stream-based, it uses less memory and fewer compute resources than many comparable programs, and it executes more efficiently. One reason for the streaming design is that piped was built for cloud native environments from the start: we want it to run well as a sidecar in container environments, working with a control plane to manage traffic. In practice, it is also simple and practical to run outside containers.
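The module chain above can be illustrated with a toy sketch. This is not piped's real implementation (piped is not written in Python); it only shows the idea that data flows through the configured modules in order, with hypothetical stand-ins for the decoder/encoder modules:

```python
from typing import Callable, List

# A module is anything that takes bytes in and produces bytes out.
Module = Callable[[bytes], bytes]

def run_pipeline(modules: List[Module], data: bytes) -> bytes:
    """Pass data through each module in configuration order."""
    for module in modules:
        data = module(data)
    return data

# Hypothetical stand-ins for modules like hessian2-decoder / json-encoder:
def to_upper(data: bytes) -> bytes:
    return data.upper()

def add_tag(data: bytes) -> bytes:
    return b"[dubbo]" + data

print(run_pipeline([to_upper, add_tag], b"hello"))  # b'[dubbo]HELLO'
```

The [module.N] sections in the INI file play the role of the `modules` list here: reordering the sections reorders the processing steps.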

Start piped:

[root@localhost piped]# piped /etc/piped/interface-recorder.ini 
Loading configuration from file /etc/piped/interface-recorder.ini
---
Pipeline [dubbo] listening on 0.0.0.0:20881
  - dubbo-request-decoder [1]
  - hessian2-decoder [2]
  - json-encoder [3]
  - dump [dumpreq]
      .filename: /tmp/dubbo.json
  - json-decoder [4]
  - hessian2-encoder [5]
  - dubbo-request-encoder [6]
  - proxy [7]
      .upstream: 192.168.122.1:20880
  - dubbo-response-decoder [8]
  - hessian2-decoder [9]
  - json-encoder [10]
  - dump [dumpres]
      .filename: /tmp/dubbo.json
  - json-decoder [11]
  - hessian2-encoder [12]
  - dubbo-response-encoder [13]
---
Loaded 1 pipeline(s)
Wed Mar 18 22:32:56 2020 [debug] Session: 0xf016a0, allocated
Wed Mar 18 22:32:56 2020 [info] Listening on 0.0.0.0:20881

Start the Provider

Import the dubbo-samples-zookeeper project into eclipse, open ProviderBootstrap.java, and run it. You should see the service register with zookeeper and enter the started state.

Confirm that port 20880 is listening:

root@localhost:~# netstat -ntlp | grep java
tcp6       0      0 :::2181                 :::*                    LISTEN      4534/java           
tcp6       0      0 :::41707                :::*                    LISTEN      4534/java           
tcp6       0      0 :::22222                :::*                    LISTEN      13512/java          
tcp6       0      0 :::20880                :::*                    LISTEN      13512/java         

Manually Register the Dubbo Gateway

After starting the Dubbo provider, we can see the newly registered provider in the zkCli.sh window:

[zk: localhost:2181(CONNECTED) 1] ls /dubbo/org.apache.dubbo.samples.api.GreetingService/providers
[dubbo%3A%2F%2F192.168.122.1%3A20880%2Forg.apache.dubbo.samples.api.GreetingService%3Fanyhost%3Dtrue%26application%3Dzookeeper-demo-provider%26deprecated%3Dfalse%26dubbo%3D2.0.2%26dynamic%3Dtrue%26generic%3Dfalse%26interface%3Dorg.apache.dubbo.samples.api.GreetingService%26methods%3DsayHello%2ChealthCheck%26pid%3D23269%26release%3D2.7.5%26revision%3D1.0.0%26side%3Dprovider%26timestamp%3D1584542277155%26version%3D1.0.0]

The address shown here is 192.168.122.1, which is the bridge address for the virtual machines on my Linux host; for the purposes of this demo it can be treated the same as 127.0.0.1.

Copy the content inside the square brackets of that output (it is actually a URL-encoded string), change the port in it from 20880 to 20881, and then use the `create` command to add a child node, that is, register a new provider: our Dubbo gateway:

create /dubbo/org.apache.dubbo.samples.api.GreetingService/providers/dubbo%3A%2F%2F192.168.122.1%3A20881%2Forg.apache.dubbo.samples.api.GreetingService%3Fanyhost%3Dtrue%26application%3Dzookeeper-demo-provider%26deprecated%3Dfalse%26dubbo%3D2.0.2%26dynamic%3Dtrue%26generic%3Dfalse%26interface%3Dorg.apache.dubbo.samples.api.GreetingService%26methods%3DsayHello%2ChealthCheck%26pid%3D13512%26release%3D2.7.5%26revision%3D1.0.0%26side%3Dprovider%26timestamp%3D1584524346911%26version%3D1.0.0 "127.0.0.1"
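
The manual edit above (decode, change the port, re-encode) can be sketched in Python. The shortened provider string here is illustrative; in practice you would paste the full string from the `ls` output:

```python
from urllib.parse import quote, unquote

# Shortened example of the percent-encoded provider string from zkCli.sh:
encoded = ("dubbo%3A%2F%2F192.168.122.1%3A20880%2F"
           "org.apache.dubbo.samples.api.GreetingService")

decoded = unquote(encoded)                    # dubbo://192.168.122.1:20880/...
rewritten = decoded.replace(":20880", ":20881", 1)  # point at the gateway port
node_name = quote(rewritten, safe="")         # percent-encode everything again
print(node_name)
```

The resulting string is what goes after `create /dubbo/.../providers/` in zkCli.sh.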

Now, listing the providers shows that the service has two:

[zk: localhost:2181(CONNECTED) 1] ls /dubbo/org.apache.dubbo.samples.api.GreetingService/providers
[dubbo%3A%2F%2F192.168.122.1%3A20880%2Forg.apache.dubbo.samples.api.GreetingService%3Fanyhost%3Dtrue%26application%3Dzookeeper-demo-provider%26deprecated%3Dfalse%26dubbo%3D2.0.2%26dynamic%3Dtrue%26generic%3Dfalse%26interface%3Dorg.apache.dubbo.samples.api.GreetingService%26methods%3DsayHello%2ChealthCheck%26pid%3D23269%26release%3D2.7.5%26revision%3D1.0.0%26side%3Dprovider%26timestamp%3D1584542277155%26version%3D1.0.0, dubbo%3A%2F%2F192.168.122.1%3A20881%2Forg.apache.dubbo.samples.api.GreetingService%3Fanyhost%3Dtrue%26application%3Dzookeeper-demo-provider%26deprecated%3Dfalse%26dubbo%3D2.0.2%26dynamic%3Dtrue%26generic%3Dfalse%26interface%3Dorg.apache.dubbo.samples.api.GreetingService%26methods%3DsayHello%2ChealthCheck%26pid%3D13512%26release%3D2.7.5%26revision%3D1.0.0%26side%3Dprovider%26timestamp%3D1584524346911%26version%3D1.0.0]

Note that the output is an array containing two elements.

Run the Dubbo Consumer

Now run ConsumerBootstrap.java from eclipse (Run/Ctrl+F11). It first fetches the provider list from zookeeper and then makes the call. We run the consumer twice: since there are two providers, one of the runs will hit the Dubbo gateway, that is, piped's port 20881.

When debugging, I often use zkCli.sh to delete the other provider, leaving only the Dubbo gateway, so that every call goes through the gateway.

Verify the Results

In the eclipse console, you can see that the service call succeeded, that is, it printed "hello, zookeeper". Now take a look at /tmp/dubbo.json; this path and filename are configured on the dump modules in piped's configuration file.

[root@piped piped]# cat /tmp/dubbo.json 
[
  "2.0.2",
  "org.apache.dubbo.samples.api.GreetingService",
  "1.0.0",
  "sayHello",
  "Ljava/lang/String;",
  "zookeeper",
  {
    "path": "org.apache.dubbo.samples.api.GreetingService",
    "remote.application": "zookeeper-demo-consumer",
    "interface": "org.apache.dubbo.samples.api.GreetingService",
    "version": "1.0.0",
    "timeout": "3000"
  }
]

[
  4,
  "hello, zookeeper",
  {
    "dubbo": "2.0.2"
  }
]

For readability, I added a blank line between the request and response messages. Here you can see the request JSON and the response JSON. The request JSON can serve as a reference template for REST-to-Dubbo-RPC requests.
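
A sketch for consuming the dump file programmatically. This assumes the dump module simply appends one JSON document per message to the file (as the output above suggests); `json.JSONDecoder.raw_decode` lets us split such a concatenated stream:

```python
import json

def read_records(text: str) -> list:
    """Split a stream of concatenated JSON documents into a list of Python objects."""
    decoder = json.JSONDecoder()
    idx, records = 0, []
    while idx < len(text):
        # Skip whitespace (newlines/blank lines) between documents.
        while idx < len(text) and text[idx].isspace():
            idx += 1
        if idx >= len(text):
            break
        obj, end = decoder.raw_decode(text, idx)
        records.append(obj)
        idx = end
    return records

# Shortened stand-in for the contents of /tmp/dubbo.json:
sample = '["2.0.2", "sayHello"]\n[4, "hello, zookeeper"]'
print(len(read_records(sample)))  # 2
```

Each recovered record can then be fed to auditing tools, or used as the JSON template for a REST-to-RPC conversion layer.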