Compiling the Blink Branch and Running WordCount

Compiling the Blink branch

$ git clone https://github.com/apache/flink.git
$ git checkout blink
$ mvn clean package -DskipTests

Build error: flink-runtime-web

[ERROR] WARN tarball tarball data for monaco-editor@0.14.3 (sha512-RhaO4xXmWn/p0WrkEOXe4PoZj6xOcvDYjoAh0e1kGUrQnP1IOpc0m86Ceuaa2CLEMDINqKijBSmqhvBQnsPLHQ==) seems to be corrupted. Trying one more time.
$ cd flink/flink-runtime-web/web-dashboard
$ rm package-lock.json
$ npm install --registry https://registry.cnpmjs.org
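The warning means the downloaded tarball's SHA-512 digest no longer matches the integrity value pinned in package-lock.json, which is why deleting the lock file and reinstalling from a mirror registry helps. As a rough Python sketch (the helper name is hypothetical, not npm's actual code), npm computes that integrity string like this:

```python
import base64
import hashlib

def npm_integrity(data: bytes) -> str:
    """Compute an npm-style integrity string: "sha512-" followed by the
    base64-encoded SHA-512 digest of the tarball bytes. npm compares this
    value against package-lock.json and warns on a mismatch."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")

# A mismatch between this value and the one recorded in the lock file
# is what produces the "seems to be corrupted" warning during the build.
```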

Build succeeded

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] force-shading ...................................... SUCCESS [ 3.251 s]
[INFO] flink .............................................. SUCCESS [ 2.229 s]
[INFO] flink-annotations .................................. SUCCESS [ 1.725 s]
[INFO] flink-shaded-hadoop ................................ SUCCESS [ 0.098 s]
[INFO] flink-shaded-hadoop2 ............................... SUCCESS [ 18.742 s]
[INFO] flink-shaded-hadoop2-uber .......................... SUCCESS [ 18.281 s]
[INFO] flink-shaded-yarn-tests ............................ SUCCESS [ 17.030 s]
[INFO] flink-shaded-curator ............................... SUCCESS [ 1.118 s]
[INFO] flink-test-utils-parent ............................ SUCCESS [ 0.153 s]
[INFO] flink-test-utils-junit ............................. SUCCESS [ 1.380 s]
[INFO] flink-metrics ...................................... SUCCESS [ 0.109 s]
[INFO] flink-metrics-core ................................. SUCCESS [ 1.015 s]
[INFO] flink-core ......................................... SUCCESS [ 16.713 s]
[INFO] flink-java ......................................... SUCCESS [ 5.560 s]
[INFO] flink-queryable-state .............................. SUCCESS [ 0.039 s]
[INFO] flink-queryable-state-client-java .................. SUCCESS [ 0.995 s]
[INFO] flink-filesystems .................................. SUCCESS [ 0.067 s]
[INFO] flink-hadoop-fs .................................... SUCCESS [ 1.588 s]
[INFO] flink-metrics-dropwizard ........................... SUCCESS [ 1.173 s]
[INFO] flink-runtime ...................................... SUCCESS [01:36 min]
[INFO] flink-optimizer .................................... SUCCESS [ 2.852 s]
[INFO] flink-clients ...................................... SUCCESS [ 1.925 s]
[INFO] flink-streaming-java ............................... SUCCESS [ 10.766 s]
[INFO] flink-scala ........................................ SUCCESS [ 58.309 s]
[INFO] flink-examples ..................................... SUCCESS [ 0.325 s]
[INFO] flink-examples-batch ............................... SUCCESS [ 18.870 s]
[INFO] flink-test-utils ................................... SUCCESS [ 4.447 s]
[INFO] flink-state-backends ............................... SUCCESS [ 0.057 s]
[INFO] flink-statebackend-rocksdb ......................... SUCCESS [ 1.502 s]
[INFO] flink-libraries .................................... SUCCESS [ 0.043 s]
[INFO] flink-cep .......................................... SUCCESS [ 2.939 s]
[INFO] flink-java8 ........................................ SUCCESS [ 3.459 s]
[INFO] flink-mapr-fs ...................................... SUCCESS [ 2.673 s]
[INFO] flink-s3-fs-hadoop ................................. SUCCESS [ 28.446 s]
[INFO] flink-s3-fs-presto ................................. SUCCESS [ 32.638 s]
[INFO] flink-swift-fs-hadoop .............................. SUCCESS [ 25.083 s]
[INFO] flink-runtime-web .................................. SUCCESS [03:50 min]
[INFO] flink-connectors ................................... SUCCESS [ 0.184 s]
[INFO] flink-hadoop-compatibility ......................... SUCCESS [ 17.588 s]
[INFO] flink-yarn ......................................... SUCCESS [ 25.585 s]
[INFO] flink-yarn-shuffle ................................. SUCCESS [ 5.183 s]
[INFO] flink-tests ........................................ SUCCESS [01:02 min]
[INFO] flink-streaming-scala .............................. SUCCESS [ 48.676 s]
[INFO] flink-table-common ................................. SUCCESS [ 0.966 s]
[INFO] flink-python ....................................... SUCCESS [ 6.741 s]
[INFO] flink-service ...................................... SUCCESS [ 0.331 s]
[INFO] flink-table ........................................ SUCCESS [06:07 min]
[INFO] flink-orc .......................................... SUCCESS [ 2.031 s]
[INFO] flink-jdbc ......................................... SUCCESS [ 0.857 s]
[INFO] flink-hbase ........................................ SUCCESS [ 19.302 s]
[INFO] flink-hcatalog ..................................... SUCCESS [ 13.333 s]
[INFO] flink-formats ...................................... SUCCESS [ 0.075 s]
[INFO] flink-avro ......................................... SUCCESS [ 4.858 s]
[INFO] flink-json ......................................... SUCCESS [ 0.625 s]
[INFO] flink-metrics-jmx .................................. SUCCESS [ 0.532 s]
[INFO] flink-connector-kafka-base ......................... SUCCESS [ 5.024 s]
[INFO] flink-connector-kafka-0.8 .......................... SUCCESS [ 8.662 s]
[INFO] flink-connector-kafka-0.9 .......................... SUCCESS [ 12.373 s]
[INFO] flink-connector-kafka-0.10 ......................... SUCCESS [ 8.558 s]
[INFO] flink-connector-kafka-0.11 ......................... SUCCESS [ 8.825 s]
[INFO] flink-connector-elasticsearch-base ................. SUCCESS [ 6.209 s]
[INFO] flink-connector-elasticsearch ...................... SUCCESS [ 11.608 s]
[INFO] flink-connector-elasticsearch2 ..................... SUCCESS [ 22.062 s]
[INFO] flink-connector-elasticsearch5 ..................... SUCCESS [ 28.310 s]
[INFO] flink-connector-rabbitmq ........................... SUCCESS [ 1.076 s]
[INFO] flink-connector-twitter ............................ SUCCESS [ 3.440 s]
[INFO] flink-connector-nifi ............................... SUCCESS [ 8.966 s]
[INFO] flink-connector-cassandra .......................... SUCCESS [ 15.242 s]
[INFO] flink-connector-filesystem ......................... SUCCESS [ 1.474 s]
[INFO] flink-connector-hive ............................... SUCCESS [01:05 min]
[INFO] flink-examples-streaming ........................... SUCCESS [ 24.061 s]
[INFO] flink-examples-table ............................... SUCCESS [ 16.828 s]
[INFO] flink-queryable-state-runtime ...................... SUCCESS [ 0.767 s]
[INFO] flink-end-to-end-tests ............................. SUCCESS [ 0.027 s]
[INFO] flink-parent-child-classloading-test ............... SUCCESS [ 0.238 s]
[INFO] flink-dataset-allround-test ........................ SUCCESS [ 0.159 s]
[INFO] flink-datastream-allround-test ..................... SUCCESS [ 0.483 s]
[INFO] flink-bucketing-sink-test .......................... SUCCESS [ 0.669 s]
[INFO] flink-high-parallelism-iterations-test ............. SUCCESS [ 7.577 s]
[INFO] flink-stream-stateful-job-upgrade-test ............. SUCCESS [ 3.358 s]
[INFO] flink-local-recovery-and-allocation-test ........... SUCCESS [ 0.462 s]
[INFO] flink-elasticsearch1-test .......................... SUCCESS [ 2.895 s]
[INFO] flink-elasticsearch2-test .......................... SUCCESS [ 3.812 s]
[INFO] flink-elasticsearch5-test .......................... SUCCESS [ 3.511 s]
[INFO] flink-distributed-cache-via-blob-test .............. SUCCESS [ 0.186 s]
[INFO] flink-gelly ........................................ SUCCESS [ 2.952 s]
[INFO] flink-gelly-scala .................................. SUCCESS [ 30.644 s]
[INFO] flink-gelly-examples ............................... SUCCESS [ 14.664 s]
[INFO] flink-sql-parser ................................... SUCCESS [02:09 min]
[INFO] flink-sql-client ................................... SUCCESS [ 5.408 s]
[INFO] flink-ml ........................................... SUCCESS [01:50 min]
[INFO] flink-cep-scala .................................... SUCCESS [ 18.054 s]
[INFO] flink-streaming-python ............................. SUCCESS [ 6.130 s]
[INFO] flink-scala-shell .................................. SUCCESS [ 15.923 s]
[INFO] flink-quickstart ................................... SUCCESS [ 0.833 s]
[INFO] flink-quickstart-java .............................. SUCCESS [ 3.798 s]
[INFO] flink-quickstart-scala ............................. SUCCESS [ 0.140 s]
[INFO] flink-contrib ...................................... SUCCESS [ 0.026 s]
[INFO] flink-connector-wikiedits .......................... SUCCESS [ 0.886 s]
[INFO] flink-container .................................... SUCCESS [ 0.130 s]
[INFO] flink-mesos ........................................ SUCCESS [ 31.995 s]
[INFO] flink-metrics-ganglia .............................. SUCCESS [ 4.571 s]
[INFO] flink-metrics-graphite ............................. SUCCESS [ 0.663 s]
[INFO] flink-metrics-prometheus ........................... SUCCESS [ 2.815 s]
[INFO] flink-metrics-statsd ............................... SUCCESS [ 0.203 s]
[INFO] flink-metrics-datadog .............................. SUCCESS [ 0.309 s]
[INFO] flink-metrics-slf4j ................................ SUCCESS [ 0.211 s]
[INFO] flink-kubernetes ................................... SUCCESS [ 24.516 s]
[INFO] flink-dist ......................................... SUCCESS [ 59.600 s]
[INFO] flink-yarn-tests ................................... SUCCESS [ 39.375 s]
[INFO] flink-fs-tests ..................................... SUCCESS [ 0.539 s]
[INFO] flink-docs ......................................... SUCCESS [ 4.522 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 32:55 min
[INFO] Finished at: 2019-03-31T19:54:04+08:00
[INFO] Final Memory: 421M/1377M
[INFO] ------------------------------------------------------------------------

Standalone

Start the standalone cluster

$ cd /Users/shaozhipeng/Development/project/java/flink/build-target
$ ./bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host localhost.
log4j:WARN No appenders could be found for logger (org.apache.flink.configuration.GlobalConfiguration).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Starting taskexecutor daemon on host localhost.


Submit a WordCount job

$ ./bin/flink run examples/streaming/WordCount.jar
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Program execution finished
Job with JobID 9da43417afd3eaa3c2a590191e12a9e0 has finished.
Job Runtime: 746 ms


The result output is printed to stdout, so it appears in the TaskManager's .out log file.

Check the processes:

$ jps
44592 TaskManagerRunner
44294 StandaloneSessionClusterEntrypoint

$ ps aux | grep 44294
shaozhipeng 44294 0.6 3.4 6945536 282388 s001 S 1:22下午 0:24.26 /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/bin/java -Xms1024m -Xmx1024m -Dlog.file=/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/log/flink-shaozhipeng-standalonesession-0-localhost.log -Dcode.file=/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/log/flink-shaozhipeng-standalonesession-0-localhost.code -Dlog4j.configuration=file:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/conf/log4j.properties -Dlogback.configurationFile=file:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/conf/logback.xml -Xloggc:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/log/flink-shaozhipeng-standalonesession-0-localhost-gc.log -classpath :/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/lib/flink-dist_2.11-1.5.1.jar:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/lib/flink-python_2.11-1.5.1.jar:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/lib/flink-shaded-hadoop2-uber-1.5.1.jar:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/lib/log4j-1.2.17.jar:/Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/lib/slf4j-log4j12-1.7.7.jar::: org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint --configDir /Users/shaozhipeng/Development/project/java/flink/flink-dist/target/flink-1.5.1-bin/flink-1.5.1/conf --executionMode cluster
shaozhipeng 44838 0.0 0.0 4295984 556 s001 R+ 1:32下午 0:00.00 grep 44294
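Note the -Xms1024m -Xmx1024m in the JVM command line above: the launch scripts derive these flags from jobmanager.heap.mb in flink-conf.yaml. A minimal sketch of that mapping (a hypothetical helper, not Flink's actual shell code):

```python
def jvm_heap_flags(heap_mb: int) -> str:
    """Translate a configured heap size in MB into the JVM heap flags
    visible in the ps output (1024 -> "-Xms1024m -Xmx1024m")."""
    return f"-Xms{heap_mb}m -Xmx{heap_mb}m"
```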

Check the default values in the config file:

$ vi conf/flink-conf.yaml 
jobmanager.rpc.address: localhost
# The RPC port where the JobManager is reachable.
jobmanager.rpc.port: 6123
# The heap size for the JobManager JVM
jobmanager.heap.mb: 1024
# The heap size for the TaskManager JVM
taskmanager.heap.mb: 1024
# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.
taskmanager.numberOfTaskSlots: 4
# The parallelism used for programs that did not specify and other parallelism.
parallelism.default: 1
# the managed memory size for each task manager.
taskmanager.managed.memory.size: 256
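With taskmanager.numberOfTaskSlots: 4 and parallelism.default: 1, a single TaskManager offers four slots, and a job's parallelism cannot exceed the cluster's total slot count. A quick sketch of that relationship (helper names are illustrative):

```python
def total_slots(num_taskmanagers: int, slots_per_tm: int) -> int:
    """Total task slots in the cluster; each slot runs one parallel
    pipeline, so this caps the usable job parallelism."""
    return num_taskmanagers * slots_per_tm

def can_run(parallelism: int, num_taskmanagers: int, slots_per_tm: int) -> bool:
    """A job fits if its parallelism does not exceed the slot total."""
    return parallelism <= total_slots(num_taskmanagers, slots_per_tm)
```

With the defaults above (one TaskManager, four slots), a job with parallelism 4 fits, but parallelism 5 would wait for resources.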

Stop the standalone cluster:

$ ./bin/stop-cluster.sh 
Stopping taskexecutor daemon (pid: 44592) on host localhost.
Stopping standalonesession daemon (pid: 44294) on host localhost.

YARN

Start Hadoop

$ start-dfs.sh
$ start-yarn.sh
$ pwd
/Users/shaozhipeng/Development/flink-1.7.2

Create a Flink cluster in YARN session mode

$ ./bin/yarn-session.sh -n 4 -jm 1024m -tm 4096m
Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
http://localhost:51873
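The flags above ask YARN for four TaskManager containers (-n 4) of 4096 MB each (-tm) plus a 1024 MB JobManager (-jm). Roughly, ignoring YARN's container-size rounding and memory cutoffs, the session requests:

```python
def yarn_session_memory_mb(num_tms: int, tm_mb: int, jm_mb: int) -> int:
    """Approximate total memory requested from YARN for a session:
    -n 4 -jm 1024m -tm 4096m -> 4 * 4096 + 1024 = 17408 MB."""
    return num_tms * tm_mb + jm_mb
```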

http://localhost:8088/cluster
(If the HADOOP_CONF_DIR / YARN_CONF_DIR warning appears, export HADOOP_CONF_DIR to point at Hadoop's configuration directory before starting the session.)


$ yarn application -list
2019-04-02 09:30:02,436 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1554167014004_0002 Flink session cluster Apache Flink shaozhipeng default RUNNING UNDEFINED 100% http://localhost:51873

$ jps
49288 YarnSessionClusterEntrypoint
49169 FlinkYarnSessionCli

$ hadoop fs -ls /user/shaozhipeng/
drwxr-xr-x - shaozhipeng supergroup 0 2019-04-02 09:16 /user/shaozhipeng/.flink

Submit a WordCount job

$ hadoop fs -mkdir /test/input
$ hadoop fs -put log4j.properties /test/input/
$ hadoop fs -ls /test/input
-rw-r--r-- 1 shaozhipeng supergroup 13326 2019-04-02 09:20 /test/input/log4j.properties


Besides submitting the jar directly to the running session (which is discovered via the YARN properties file), you can also target a session explicitly by its YARN application id (Flink's -yid option).

$ ./bin/flink run examples/streaming/WordCount.jar --input hdfs:///test/input/log4j.properties --output hdfs:///test/output
2019-04-02 09:22:25,307 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /var/folders/yb/7wj4v1796hn7cl1kz_qfpkn40000gn/T/.yarn-properties-shaozhipeng.
2019-04-02 09:22:25,307 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /var/folders/yb/7wj4v1796hn7cl1kz_qfpkn40000gn/T/.yarn-properties-shaozhipeng.
2019-04-02 09:22:25,711 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - YARN properties set default parallelism to 4
2019-04-02 09:22:25,711 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - YARN properties set default parallelism to 4
YARN properties set default parallelism to 4
2019-04-02 09:22:25,759 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2019-04-02 09:22:25,879 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-04-02 09:22:25,879 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-04-02 09:22:25,888 WARN org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set.The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2019-04-02 09:22:25,986 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - Found application JobManager host name 'localhost' and port '51873' from supplied application id 'application_1554167014004_0002'
Starting execution of program
Program execution finished
Job with JobID 238555fc7bcd2e1f6940ea8628f952d7 has finished.
Job Runtime: 12280 ms
$ hadoop fs -cat /test/output
(licensed,1)
(to,1)
(the,1)
(apache,1)
(software,1)
(foundation,1)
(asf,1)
(under,1)
(one,1)
(or,1)
(more,1)
(contributor,1)
(license,1)
(agreements,1)
(see,1)
(the,2)
(notice,1)
(file,1)
(distributed,1)
(with,1)
(this,1)
(work,1)
(for,1)
(additional,1)
(information,1)
(regarding,1)
(copyright,1)
(ownership,1)
(the,3)
(asf,2)
(licenses,1)
(this,2)
(file,2)
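Repeated pairs like (the,1), (the,2), (the,3) are expected: the streaming WordCount emits an updated running count every time a word occurrence arrives, rather than one final total per word. A self-contained Python sketch of that behavior (not the Flink Java source):

```python
import re
from collections import defaultdict

def streaming_word_count(lines):
    """Yield a (word, running_count) pair for every word occurrence,
    mirroring why the output above repeats words with growing counts."""
    counts = defaultdict(int)
    for line in lines:
        for word in re.findall(r"[a-z]+", line.lower()):
            counts[word] += 1
            yield (word, counts[word])
```

Feeding it two lines that both contain "the" yields ("the", 1) and later ("the", 2), matching the pattern in the HDFS output.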
$ ./bin/flink run -m yarn-cluster -yn 2 examples/streaming/WordCount.jar --input hdfs:///test/input/log4j.properties --output hdfs:///test/job1
Here -m yarn-cluster -yn 2 starts a dedicated per-job YARN cluster with two TaskManagers instead of submitting to the existing session.


邵志鹏