Data Analysis: incubator-superset, an Open-Source Visualization Project (Part 1)

Data Analysis
docker-superset-python3.6
CentOS Linux release 7.3.1611 (Core)
Airbnb currently uses 2.7.* in production.
https://github.com/apache/incubator-superset
Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application

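Basic Docker installation

The Docker version installed here is old; it is upgraded near the end of this post so that docker-compose can be used.
More details are at the end of the post.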

# yum -y install docker

# docker --version
Docker version 1.12.6, build 88a4867/1.12.6

# yum list installed | grep docker
docker.x86_64 2:1.12.6-32.git88a4867.el7.centos @extras
docker-client.x86_64 2:1.12.6-32.git88a4867.el7.centos @extras
docker-common.x86_64 2:1.12.6-32.git88a4867.el7.centos @extras

# systemctl start docker

# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
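
Since the packaged Docker turns out to be too old for docker-compose later in this post, it can be worth checking the version right after installation. The following is a minimal sketch, not part of the original walkthrough: `version_ge` and `parse_docker_version` are hypothetical helper names, and the 1.13.0 threshold is an assumption based on the Compose file format version 3 used later.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reads a line such as "Docker version 1.12.6, build 88a4867/1.12.6" on stdin
# and prints only the dotted version number.
parse_docker_version() {
  sed -E 's/^Docker version ([0-9]+(\.[0-9]+)*).*/\1/'
}

# Assumed minimum for Compose file format 3; adjust as needed.
MIN_DOCKER_VERSION=${MIN_DOCKER_VERSION:-1.13.0}

# Only query Docker when the binary is actually installed.
if command -v docker >/dev/null 2>&1; then
  current=$(docker --version | parse_docker_version)
  if version_ge "$current" "$MIN_DOCKER_VERSION"; then
    echo "Docker $current is new enough"
  else
    echo "Docker $current is older than $MIN_DOCKER_VERSION; consider upgrading"
  fi
fi
```

With the 1.12.6 package installed above, the comparison fails and the script suggests an upgrade.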

Installing Superset with Docker

https://hub.docker.com/r/amancevice/superset/
Installation and basic operations

# docker pull amancevice/superset

# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/amancevice/superset latest 3a0e39b939c2 7 hours ago 1.305 GB

# docker run -d -i -t 3a0e39b939c2 /bin/bash
1a5370f773f5dc42ffb5edd5b6476054d27145cc91205d4160621e1aa86cbe03

# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
1a5370f773f5 3a0e39b939c2 "superset /bin/bash" 21 seconds ago Up 20 seconds (health: starting) 8088/tcp zen_fermi
02b44e159389 amancevice/superset "superset runserver" 11 minutes ago Up 11 minutes (healthy) 8088/tcp superset

Stopping and removing the container

# docker stop superset
superset

# docker rm superset
superset

# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

Usage, creating a user, initializing data, and loading the official examples

usage: superset [-?]
{db,init,runserver,version,load_examples,refresh_druid,update_datasources_cache,worker,flower,shell}

Initializing Superset (official examples)

# docker run -d -p 0.0.0.0:8088:8088 --name superset amancevice/superset
82b46f6fed7bb819bc34eff13f4abf0c9dd91e858e41897f78a5ae372133861a

# docker exec -it superset fabmanager create-admin --app superset
Username [admin]: superset
User first name [admin]:
User last name [user]:
Email [admin@fab.org]: bobrave@163.com
Password:
Repeat for confirmation:
Recognized Database Authentications.
Admin User superset created.

# docker exec -it superset superset db upgrade
INFO [alembic.runtime.migration] Context impl SQLiteImpl.
......

Access from a browser

Open port 8088 on the host, the port published by the docker run command above.

Error when loading the Superset examples

Address this before moving to a production environment.

# docker exec -it superset superset load_examples
Loading examples into <SQLA engine='sqlite:////home/superset/.superset/superset.db'>
Creating default CSS templates
Loading energy related dataset
Creating table [wb_health_population] reference
2017-08-31 12:23:56,612:INFO:root:Creating database reference
Loading [World Bank's Health Nutrition and Population Stats]
Traceback (most recent call last):
File "/usr/local/bin/superset", line 15, in <module>
manager.run()
File "/usr/local/lib/python3.6/site-packages/flask_script/__init__.py", line 412, in run
result = self.handle(sys.argv[0], sys.argv[1:])
File "/usr/local/lib/python3.6/site-packages/flask_script/__init__.py", line 383, in handle
res = handle(*args, **config)
File "/usr/local/lib/python3.6/site-packages/flask_script/commands.py", line 216, in __call__
return self.run(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/superset/cli.py", line 117, in load_examples
data.load_world_bank_health_n_pop()
File "/usr/local/lib/python3.6/site-packages/superset/data/__init__.py", line 159, in load_world_bank_health_n_pop
pdf = pd.read_json(f)
File "/usr/local/lib/python3.6/site-packages/pandas/io/json/json.py", line 354, in read_json
date_unit).parse()
File "/usr/local/lib/python3.6/site-packages/pandas/io/json/json.py", line 422, in parse
self._parse_no_numpy()
File "/usr/local/lib/python3.6/site-packages/pandas/io/json/json.py", line 639, in _parse_no_numpy
loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Could not reserve memory block
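
The `Could not reserve memory block` error suggests that memory ran out while pandas parsed the World Bank JSON file. As a rough sketch (not from the original post), the output of `free -m` can be reduced to a single headroom figure before retrying. `headroom_mb` is a hypothetical helper, and the 1024 MiB threshold is an arbitrary assumption, not a documented Superset requirement.

```shell
#!/bin/sh
# headroom_mb: reads `free -m` output on stdin and prints
# available memory + free swap, in MiB.
# Assumes the column layout where "available" is the 7th field of the Mem: row
# and "free" is the 4th field of the Swap: row.
headroom_mb() {
  awk '/^Mem:/ {mem = $7} /^Swap:/ {swap = $4} END {print mem + swap}'
}

# Arbitrary, assumed threshold for illustration only.
REQUIRED_MB=${REQUIRED_MB:-1024}

# Only run the check where the `free` command exists.
if command -v free >/dev/null 2>&1; then
  have=$(free -m | headroom_mb)
  if [ "$have" -lt "$REQUIRED_MB" ]; then
    echo "Only ${have} MiB of memory+swap headroom; consider adding swap first"
  else
    echo "${have} MiB of memory+swap headroom"
  fi
fi
```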

Changing Docker's data directory
Edit the configuration file /etc/sysconfig/docker, then restart Docker.

# systemctl stop docker
# mkdir -p /home/docker
# vi /etc/sysconfig/docker

OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false -g /home/docker'

# rm -rf /var/lib/docker/
# systemctl start docker
# ls /home/docker/
containers devicemapper image network swarm tmp trust volumes
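
Note that `rm -rf /var/lib/docker/` discards whatever images and containers were stored there. If they should survive the move, the old directory can be copied to the new location before it is removed. This is a minimal sketch, assuming Docker is stopped and GNU `cp` is available; `copy_tree` is a hypothetical helper name.

```shell
#!/bin/sh
# copy_tree SRC DST: copy the contents of SRC into DST, preserving
# ownership, permissions and timestamps (cp -a). DST is created if missing.
copy_tree() {
  src=$1
  dst=$2
  [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
  mkdir -p "$dst" && cp -a "$src"/. "$dst"/
}

# Intended use for this walkthrough (run as root, with Docker stopped):
#   copy_tree /var/lib/docker /home/docker
# Remove /var/lib/docker only after verifying the copy.
```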

Handling the error: adding swap space
https://www.digitalocean.com/community/tutorials/how-to-add-swap-on-centos-7
If swapon fails with "swapon: swapfile: swapon failed: Invalid argument",
create the swap file with dd instead of fallocate.

# dd if=/dev/zero of=/swapfile count=1024 bs=1MiB
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 2.65844 s, 404 MB/s

# chmod 600 /swapfile
# mkswap /swapfile
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=4b20925a-a6c1-4ec4-a4a9-1e8535a727f0

# swapon /swapfile
# swapon -s
Filename Type Size Used Priority
/dev/dm-1 partition 839676 784160 -1
/swapfile file 1048572 0 -2

# vi /etc/fstab
/swapfile swap swap sw 0 0

# free -m
total used free shared buff/cache available
Mem: 992 568 61 1 362 251
Swap: 1843 765 1078
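
Editing `/etc/fstab` by hand works, but when the step is scripted it helps to avoid appending the same entry twice. The sketch below illustrates one way to do that; `ensure_line` is a hypothetical helper, and the target file is passed as a parameter so it can be tried on a copy first.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
  file=$1
  line=$2
  # -x: match whole lines only, -F: treat LINE as a fixed string.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Intended use (as root), with the entry from the transcript:
#   ensure_line /etc/fstab '/swapfile swap swap sw 0 0'
```

Running the helper a second time with the same arguments leaves the file unchanged.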

Continuing with loading the Superset examples

# docker exec -it superset superset load_examples
Loading examples into <SQLA engine='sqlite:////home/superset/.superset/superset.db'>
Creating default CSS templates
Loading energy related dataset
Creating table [wb_health_population] reference
2017-09-01 03:07:07,361:INFO:root:Creating database reference
Loading [World Bank's Health Nutrition and Population Stats]
Creating table [wb_health_population] reference
2017-09-01 03:07:31,454:INFO:root:Creating database reference
Creating slices
Creating a World's Health Bank dashboard
Loading [Birth names]
Done loading table!
--------------------------------------------------------------------------------
Creating table [birth_names] reference
2017-09-01 03:08:05,145:INFO:root:Creating database reference
Creating some slices
Creating a dashboard
Loading [Random time series data]
Done loading table!
--------------------------------------------------------------------------------
Creating table [random_time_series] reference
2017-09-01 03:08:06,741:INFO:root:Creating database reference
Creating a slice
Loading [Random long/lat data]
Done loading table!
--------------------------------------------------------------------------------
Creating table reference
2017-09-01 03:08:17,536:INFO:root:Creating database reference
Creating a slice
Loading [Country Map data]
Done loading table!
--------------------------------------------------------------------------------
Creating table reference
2017-09-01 03:08:18,509:INFO:root:Creating database reference
Creating a slice
Loading [Multiformat time series]
Done loading table!
--------------------------------------------------------------------------------
Creating table [multiformat_time_series] reference
2017-09-01 03:08:20,073:INFO:root:Creating database reference
Creating some slices
Loading [Misc Charts] dashboard
Creating the dashboard

# docker exec -it superset superset init
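
Taken together, the container setup so far boils down to four commands run in order. The sketch below simply collects them. `bootstrap_superset` and the `RUN` variable are hypothetical names introduced here, and setting `RUN=echo` prints the commands instead of executing them. A real run assumes the `superset` container is already up and that the script is started from an interactive terminal, since `create-admin` prompts for input.

```shell
#!/bin/sh
# bootstrap_superset: run the setup commands from this post in order,
# stopping at the first failure.
# RUN is the command prefix; it defaults to executing inside the container.
bootstrap_superset() {
  run=${RUN:-docker exec -it superset}
  # $run is intentionally unquoted so the prefix splits into words.
  $run fabmanager create-admin --app superset &&
    $run superset db upgrade &&
    $run superset load_examples &&
    $run superset init
}

# Dry run: print the four commands without touching Docker.
RUN=echo bootstrap_superset
```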

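Stepping into the world of Superset

Log in with the username and password created earlier. If the container has not been restarted since the steps above 😊, restart it before visiting the page.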

# docker restart superset
superset
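
After a restart the container needs a moment before the web server answers; the `docker ps` output earlier shows the health status moving from `starting` to `healthy`. A small polling loop can wait for that. This is a sketch only: `container_health` and `wait_until_healthy` are hypothetical helpers, and the retry count and delay are arbitrary defaults.

```shell
#!/bin/sh
# container_health NAME: print the health status reported by Docker
# (for example "starting" or "healthy"); prints nothing if unavailable.
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null
}

# wait_until_healthy NAME [TRIES] [DELAY_SECONDS]
# Polls container_health until it reports "healthy" or TRIES is exhausted.
wait_until_healthy() {
  name=$1
  tries=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if [ "$(container_health "$name")" = "healthy" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Intended use:
#   docker restart superset && wait_until_healthy superset && echo "ready"
```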


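Installing Redis with Docker and using docker-compose

http://www.runoob.com/docker/docker-install-redis.html
Installation and basic operations
Install docker-compose (pip can be used as well).
Tip: on CentOS, install the net-tools package to get ifconfig, or use ip addr to view network information.
yum install net-tools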

# docker pull redis:alpine

# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
amancevice/superset latest 3a0e39b939c2 2 days ago 1.3GB
redis alpine 9d8fa9aa0e5b 5 weeks ago 27.5MB

# docker-compose up -d redis:alpine
-bash: docker-compose: command not found

# easy_install --version
setuptools 36.2.7 from /usr/lib/python2.7/site-packages (Python 2.7)

# easy_install docker-compose

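The Docker version is too old, so docker-compose reports an error 🙃
Remove Docker, reinstall it, and adjust the configuration; the images are still there afterwards.
warning: file /var/lib/docker: remove failed: No such file or directory
Switch the yum repository from the Aliyun mirror to the official one and install the latest version.
https://docs.docker.com/v17.03/engine/installation/linux/centos/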

# yum -y remove docker docker-common container-selinux

Tools and configuration

# yum install -y yum-utils device-mapper-persistent-data lvm2
# yum-config-manager --enable extras

# yum-config-manager \
> --add-repo \
> https://download.docker.com/linux/centos/docker-ce.repo
Loaded plugins: fastestmirror
adding repo from: https://download.docker.com/linux/centos/docker-ce.repo
grabbing file https://download.docker.com/linux/centos/docker-ce.repo to /etc/yum.repos.d/docker-ce.repo
repo saved to /etc/yum.repos.d/docker-ce.repo

Disable the edge repository if you do not use it.

# yum-config-manager --enable docker-ce-edge
# yum-config-manager --disable docker-ce-edge

Installing Docker

# yum makecache fast
# yum -y install docker-ce
# docker --version
Docker version 17.06.1-ce, build 874a737

# vi /etc/docker/daemon.json
{
  "storage-driver": "devicemapper"
}

# ln -s /home/docker /var/lib/docker
# systemctl start docker
# docker info
...
Data loop file: /home/docker/devicemapper/devicemapper/data
Metadata loop file: /home/docker/devicemapper/devicemapper/metadata
...
Docker Root Dir: /home/docker
...

# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
amancevice/superset latest 3a0e39b939c2 2 days ago 1.3GB
redis alpine 9d8fa9aa0e5b 5 weeks ago 27.5MB

Starting Redis with docker-compose

# vi docker-compose.yml
version: '3'
services:
  redis:
    image: redis:alpine
    restart: always
    volumes:
      - redis:/home/data

volumes:
  redis:
    external: false

# docker-compose up -d redis
Creating opt_redis_1 ...
Creating opt_redis_1 ... done

Starting the service with a mapped port and connecting with a client

# docker run -p 6379:6379 -v /home/data:/data  -d redis:alpine redis-server --appendonly yes
60af8f273ba15a0d0ebae2a789ea8c5d332ff00fc7028418e17ef870cd52467d
# lsof -i:6379
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
docker-pr 894 root 4u IPv6 131575 0t0 TCP *:6379 (LISTEN)

# ps -ef|grep 6379
root 894 20312 0 10:10 ? 00:00:00 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 6379 -container-ip 172.17.0.2 -container-port 6379
root 4513 31825 0 10:49 pts/0 00:00:00 grep --color=auto 6379

# docker run -it redis:alpine redis-cli -h 172.17.0.2
172.17.0.2:6379> keys *
(empty list or set)
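
The Compose file shown earlier does not publish a port or enable append-only persistence, whereas the `docker run` command above does both. As a sketch, the two can be aligned by writing a Compose file that mirrors the `docker run` flags. The file name `docker-compose.redis.yml` and the `COMPOSE_FILE_PATH` variable are hypothetical choices made here to avoid overwriting the existing file, and mounting the named volume at `/data` follows the `-v /home/data:/data` mapping above.

```shell
#!/bin/sh
# Write a Compose file roughly equivalent to:
#   docker run -p 6379:6379 -v ...:/data -d redis:alpine redis-server --appendonly yes
# but using the named volume "redis" instead of a host directory.
compose_file=${COMPOSE_FILE_PATH:-./docker-compose.redis.yml}

cat > "$compose_file" <<'EOF'
version: '3'
services:
  redis:
    image: redis:alpine
    restart: always
    command: redis-server --appendonly yes
    ports:
      - "6379:6379"
    volumes:
      - redis:/data

volumes:
  redis:
EOF

echo "Wrote $compose_file"
# Start it with:
#   docker-compose -f "$compose_file" up -d redis
```

With the port published this way, a client can connect through the host address rather than the container IP.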
邵志鹏