Category archive: docker

Aliyun personal Docker registry mirror accelerator

https://cr.console.aliyun.com/cn-hangzhou/instances/mirrors

{
  "bip": "192.168.55.1/24",
  "registry-mirrors": ["https://2na48vbddcw.mirror.aliyuncs.com"]
}

(The accelerator subdomain is account-specific; I trimmed mine down to just 8 characters above.)
sudo systemctl daemon-reload
sudo systemctl restart docker
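Before restarting docker, it can help to validate the JSON first: a malformed /etc/docker/daemon.json (for example the curly quotes that sneak in when pasting from a web page, as above) will stop the daemon from starting. A minimal sketch, writing to a temp directory rather than /etc/docker directly:

```shell
# Sketch: generate daemon.json into a temp dir and syntax-check it before
# copying it to /etc/docker/daemon.json (the copy step is up to you).
tmpdir=$(mktemp -d)
cat > "$tmpdir/daemon.json" <<'EOF'
{
  "bip": "192.168.55.1/24",
  "registry-mirrors": ["https://2na48vbddcw.mirror.aliyuncs.com"]
}
EOF
# Fail early if the JSON is malformed (a common cause of docker refusing to start).
python3 -m json.tool "$tmpdir/daemon.json" > /dev/null && echo "daemon.json OK"
```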

Common commands for Hive SequenceFile tables

A SequenceFile is a set of key/value pairs. In practice, when Hive creates a table over SequenceFiles, the key carries no meaning: Hive parses the data based solely on the value's format.
0. Log into the container and connect to Hive

docker-compose -f docker-compose-hive.yml exec hive-server  bash
/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000

1. Create the table

 
create external table sfgz(
     `idx` string,
     `userid` string,
     `flag` string,
     `count` string,
     `value` string,
     `memo` string)
  partitioned by (dt string)
  row format delimited fields terminated by ','
  stored as sequencefile
  location '/user/sfgz';

2. Load partitions
Method 1:
hadoop fs -mkdir -p /user/sfgz/dt=2010-05-06/
hadoop fs -put /tools/mytest.txt.sf /user/sfgz/dt=2019-05-17
hadoop fs -put /tools/mytest.txt.sf /user/sfgz/dt=2010-05-04
Files added this way cannot be seen by Hive directly; you must register each partition with an alter table ... add partition command before the data becomes queryable.
Method 2 (queryable immediately after loading):
load data local inpath '/tools/mytest.txt.sf' into table sfgz partition(dt='2009-03-01');
load data local inpath '/tools/mytest.gzip.sf' into table sfgz partition(dt='2000-03-02');
3. Check the partition info:
show partitions sfgz;
4. Add a partition
alter table sfgz add partition(dt=’2000-03-03′);
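Method 1 above can be wrapped in a small helper that prints the full sequence of commands for one day's partition. This is a dry-run sketch (pipe its output to sh to actually execute); the paths and connection string are the ones used in this post:

```shell
# Dry-run helper: print the commands needed to upload and register one
# day's partition for the sfgz table.
make_partition_cmds() {
  local dt="$1"
  printf '%s\n' \
    "hadoop fs -mkdir -p /user/sfgz/dt=$dt/" \
    "hadoop fs -put /tools/mytest.txt.sf /user/sfgz/dt=$dt/" \
    "/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000 -e \"alter table sfgz add partition(dt='$dt');\""
}
cmds=$(make_partition_cmds 2010-05-06)
echo "$cmds"
```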
5. Insert a record:

   insert into sfgz partition(dt='2019-05-16')values('idx3','uid6','5','6','34.7','uid3test2');

6. Count queries:
select count(*) from sfgz; This form is not supported on KMR.
select count(idx) from sfgz; Only this form is supported on KMR.
7. Other common commands
show databases;
use <database>;
show tables;
select * from sfgz where dt='2000-03-03';
msck repair table sfgz; (partition repair command)

Hands-on verification of docker-hive

1. Download the docker image repo https://github.com/big-data-europe/docker-hive.git and install it.
2. Modify its docker-compose.yml to add volume mappings for each container.

version: "3"
 
services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop2.7.4-java8
    volumes:
      - /data/namenode:/hadoop/dfs/name
      - /data/tools:/tools
    environment:
      - CLUSTER_NAME=test
    env_file:
      - ./hadoop-hive.env
    ports:
      - "50070:50070"
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop2.7.4-java8
    volumes:
      - /data/datanode:/hadoop/dfs/data
      - /data/tools:/tools
    env_file:
      - ./hadoop-hive.env
    environment:
      SERVICE_PRECONDITION: "namenode:50070"
    ports:
      - "50075:50075"
  hive-server:
    image: bde2020/hive:2.3.2-postgresql-metastore
    volumes:
      - /data/tools:/tools
    env_file:
      - ./hadoop-hive.env
    environment:
      HIVE_CORE_CONF_javax_jdo_option_ConnectionURL: "jdbc:postgresql://hive-metastore/metastore"
      SERVICE_PRECONDITION: "hive-metastore:9083"
    ports:
      - "10000:10000"
  hive-metastore:
    image: bde2020/hive:2.3.2-postgresql-metastore
    volumes:
      - /data/tools:/tools
    env_file:
      - ./hadoop-hive.env
    command: /opt/hive/bin/hive --service metastore
    environment:
      SERVICE_PRECONDITION: "namenode:50070 datanode:50075 hive-metastore-postgresql:5432"
    ports:
      - "9083:9083"
  hive-metastore-postgresql:
    image: bde2020/hive-metastore-postgresql:2.3.0
    volumes:
      - /data/tools:/tools
 
  presto-coordinator:
    image: shawnzhu/prestodb:0.181
    volumes:
      - /data/tools:/tools
    ports:
      - "8080:8080"

3. Create a test text file

1,xiaoming,book-TV-code,beijing:chaoyang-shagnhai:pudong
2,lilei,book-code,nanjing:jiangning-taiwan:taibei
3,lihua,music-book,heilongjiang:haerbin
3,lihua,music-book,heilongjiang2:haerbin2
3,lihua,music-book,heilongjiang3:haerbin3
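The five sample rows can be generated and sanity-checked with a short script. This is a sketch writing to the current directory; in the actual setup the file would go under /data/tools on the host, which the compose file maps to /tools in the containers:

```shell
# Write the sample rows and verify that every line has the 4 comma-separated
# fields the t2 table definition expects (id, name, hobby, add).
cat > example.txt <<'EOF'
1,xiaoming,book-TV-code,beijing:chaoyang-shagnhai:pudong
2,lilei,book-code,nanjing:jiangning-taiwan:taibei
3,lihua,music-book,heilongjiang:haerbin
3,lihua,music-book,heilongjiang2:haerbin2
3,lihua,music-book,heilongjiang3:haerbin3
EOF
awk -F',' 'NF != 4 { bad=1 } END { exit bad }' example.txt && echo "5 rows, 4 fields each"
```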

4. Start and connect to the Hive service.

docker-compose up -d
docker-compose exec hive-server bash
/opt/hive/bin/beeline -u jdbc:hive2://localhost:10000


5. Create the external table

create external table t2(
    id      int
   ,name    string
   ,hobby   array<string>
   ,`add`   map<string,string>
)
row format delimited
fields terminated by ','
collection items terminated by '-'
map keys terminated by ':'
location '/user/t2';


6. Upload the file into the directory from the previous step.
Method 1: from the Hive beeline terminal:
load data local inpath '/tools/example.txt' overwrite into table t2; Deletes any existing files first, then writes the new one.
load data local inpath '/tools/example.txt' into table t2; Adds a new file alongside the existing ones (the only difference is the overwrite keyword).
Method 2: upload with hadoop fs -put:
hadoop fs -put /tools/example.txt /user/t2 (keeps the original file name)
hadoop fs -put /tools/example.txt /user/t2/1.txt (stores it as 1.txt)
7. Verify from the Hive command line

select * from t2;  Run this again after each upload.


8. The newly uploaded files can also be browsed in Hadoop's web file manager.

Records within a single file are automatically deduplicated.

——————————————-
What about SequenceFiles?
1. Inspect the contents of a SequenceFile:
hadoop fs -Dfs.default.name=file:/// -text /tools/mytest.gzip.sf (deprecated property name)
hadoop fs -Dfs.defaultFS=file:/// -text /tools/mytest.txt.sf

The actual contents are:

2. Create the table

  create external table sfgz(
     `idx` string,
     `userid` string,
     `flag` string,
     `count` string,
     `value` string,
     `memo` string)
  partitioned by (dt string)
  row format delimited fields terminated by ','
  stored as sequencefile
  location '/user/sfgz';

3. Upload the file

Method 1:
hadoop fs -mkdir -p /user/sfgz/dt=2010-05-06/
hadoop fs -put /tools/mytest.txt.sf /user/sfgz/dt=2019-05-17
hadoop fs -put /tools/mytest.txt.sf /user/sfgz/dt=2010-05-04
This method still requires manually registering the partitions; the repair command is msck repair table sfgz; (shown earlier).
Method 2:
load data local inpath '/tools/mytest.txt.sf' into table sfgz partition(dt='2009-03-01'); With this method the data can be queried immediately.
load data local inpath '/tools/mytest.gzip.sf' into table sfgz partition(dt='2000-03-02');

Spark/Hive image repos on GitHub

Big Data Europe
Currently the most reliable reference setups:
https://github.com/big-data-europe/docker-spark
https://github.com/big-data-europe/docker-hive
https://github.com/big-data-europe

Hive documentation
https://cwiki.apache.org/confluence/display/Hive/Home#Home-UserDocumentation

Docker deployment of a Confluence wiki

1. Writing the Dockerfile

FROM centos:6.6
 
ENV CONF_INST  /opt/atlassian/
ENV CONF_HOME  /var/atlassian/application-data/
 
 
COPY ./confluence-5.4.4.tar.gz /confluence-5.4.4.tar.gz
COPY ./application-data-init.tar.gz /application-data-init.tar.gz
RUN set -x && yum install -y tar && mkdir -p ${CONF_INST} && tar -xvf /confluence-5.4.4.tar.gz --directory "${CONF_INST}/"
 
COPY ./startup.sh /startup.sh
RUN chmod +x /startup.sh
 
EXPOSE 8090
VOLUME ["${CONF_HOME}", "${CONF_INST}"]
CMD ["/startup.sh"]
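The startup.sh that the Dockerfile copies in isn't shown. A minimal hypothetical version might look like the following; the /opt/atlassian/confluence path is an assumption based on the catalina.sh path that appears in the compose file's comments:

```shell
# Hypothetical startup.sh: run Confluence's Tomcat in the foreground so the
# container stays alive (exec makes catalina.sh PID 1 and lets it receive signals).
cat > startup.sh <<'EOF'
#!/bin/bash
set -e
export CONF_HOME=/var/atlassian/application-data/confluence
exec /opt/atlassian/confluence/bin/catalina.sh run
EOF
chmod +x startup.sh
```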

2. Writing the docker-compose.yml

version: '3.1'
 
services:
  confluence:
    image: wiki:1.0
    restart: always
    ports:
      - 8090:8090
    #entrypoint: bash -c "ping 127.0.0.1"
    #command: bash -c "ping 127.0.0.1"
    #command: /opt/atlassian/confluence/bin/catalina.sh run
    volumes:
      - /data/atlassian/confluence/logs:/opt/atlassian/confluence/logs
      - /data/atlassian/confluence/logs:/opt/atlassian/application-data/confluence/logs
      - /data/atlassian/application-data:/var/atlassian/application-data
      - ./backups:/var/atlassian/application-data/confluence/backups
      - ./restore:/var/atlassian/application-data/confluence/restore:ro
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    build:
      context: ./crack
      dockerfile: Dockerfile

Shadowsocks (GFW circumvention) configuration

version: '3.1'

services:
  ssserver:
    image: mritd/shadowsocks:3.2.0
    restart: always
    ports:
      - 8973:6443 
      - 8971:6500
    environment:
      SS_CONFIG: "-s 0.0.0.0 -p 6443 -m chacha20 -k My.123 --fast-open"
      KCP_FLAG: "false"
      KCP_MODULE: "kcpserver"
      KCP_CONFIG: "-t 127.0.0.1:6443 -l :6500 -mode fast2"

Point the Shadowsocks client at port 8973 and you're done.
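A matching client-side config can be written from the SS_CONFIG values above (port 8973 on the host maps to container port 6443 where ssserver listens). The server address below is a placeholder; substitute the server's public IP:

```shell
# Hypothetical shadowsocks client config; password/method taken from the
# SS_CONFIG line above, server IP is a placeholder.
cat > ss-client.json <<'EOF'
{
  "server": "203.0.113.10",
  "server_port": 8973,
  "local_address": "127.0.0.1",
  "local_port": 1080,
  "password": "My.123",
  "method": "chacha20"
}
EOF
python3 -m json.tool ss-client.json > /dev/null && echo "client config OK"
```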

Zipkin docker configuration

Based on the configuration in https://github.com/openzipkin/docker-zipkin

version: '3.1'
 
services:
  storage:
    image: openzipkin/zipkin-mysql:2.11.7
    container_name: zipkin-mysql
    # Uncomment to expose the storage port for testing
    # ports:
    #   - 3306:3306
 
  zipkin:
    image: openzipkin/zipkin:2.11.7
    restart: always
    container_name: zipkin
    ports:
      - 9411:9411
    environment:
      - STORAGE_TYPE=mysql
      - MYSQL_HOST=zipkin-mysql
      # Uncomment to enable scribe
      - SCRIBE_ENABLED=true
      # Uncomment to enable self-tracing
      - SELF_TRACING_ENABLED=true
      # Uncomment to enable debug logging
      - JAVA_OPTS=-Dlogging.level.zipkin=DEBUG -Dlogging.level.zipkin2=DEBUG
    depends_on:
      - storage

MySQL master/slave replication configuration

https://github.com/getwingm/mysql-replica

version: '2'
services:
    master:
        image: twang2218/mysql:5.7-replica
        restart: unless-stopped
        ports:
            - 3306:3306
        environment:
            - MYSQL_ROOT_PASSWORD=master_passw0rd
            - MYSQL_REPLICA_USER=replica
            - MYSQL_REPLICA_PASS=replica_Passw0rd
        command: ["mysqld", "--log-bin=mysql-bin", "--server-id=1"]
    slave:
        image: twang2218/mysql:5.7-replica
        restart: unless-stopped
        ports:
            - 3307:3306
        environment:
            - MYSQL_ROOT_PASSWORD=slave_passw0rd
            - MYSQL_REPLICA_USER=replica
            - MYSQL_REPLICA_PASS=replica_Passw0rd
            - MYSQL_MASTER_SERVER=master
            - MYSQL_MASTER_WAIT_TIME=10
        command: ["mysqld", "--log-bin=mysql-bin", "--server-id=2"]
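Once both containers are up, replication can be spot-checked from the slave. A dry-run sketch that only prints the verification commands (pipe to sh to run them; credentials and ports are the ones from the compose file above):

```shell
# Dry-run: print commands to create a database on the master (port 3306)
# and confirm it replicated to the slave (port 3307).
checks=$(printf '%s\n' \
  "mysql -h127.0.0.1 -P3306 -uroot -pmaster_passw0rd -e \"create database repl_test;\"" \
  "mysql -h127.0.0.1 -P3307 -uroot -pslave_passw0rd -e \"show databases like 'repl_test';\"" \
  "mysql -h127.0.0.1 -P3307 -uroot -pslave_passw0rd -e \"show slave status\\G\"")
echo "$checks"
```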

Single-node Kubernetes installation

1. Stop and disable the firewall

systemctl disable firewalld
systemctl stop firewalld

2. Install the packages

yum install -y etcd kubernetes

3. Modify the docker config file:
vi /etc/sysconfig/docker

Original:
OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false'
Modified:
OPTIONS='--selinux-enabled=false --insecure-registry gcr.io --log-driver=journald --signature-verification=false'
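The edit can also be scripted with sed instead of vi. A sketch, demonstrated on a copy of the file (point it at /etc/sysconfig/docker for real; the pattern assumes the stock OPTIONS line shown above):

```shell
# Apply the two changes non-interactively: disable selinux and trust gcr.io.
cp_file=docker.sysconfig
echo "OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false'" > "$cp_file"
sed -i "s/--selinux-enabled /--selinux-enabled=false --insecure-registry gcr.io /" "$cp_file"
grep OPTIONS "$cp_file"
```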

4. Check the etcd configuration; it should look like the following (edit it if it does not):

grep -v '^#' /etc/etcd/etcd.conf
 
[root@localhost abc]# grep -v '^#' /etc/etcd/etcd.conf
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379"
ETCD_NAME="default"
ETCD_ADVERTISE_CLIENT_URLS="http://localhost:2379"


5. Modify the /etc/kubernetes/apiserver file
Change KUBE_ADMISSION_CONTROL to:

KUBE_ADMISSION_CONTROL="--admission-control=NamespaceLifecycle,NamespaceExists,LimitRanger,SecurityContextDeny,ResourceQuota"

6. Start the services

Start:
systemctl start etcd docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy
Restart:
systemctl restart etcd docker kube-apiserver kube-controller-manager kube-scheduler kubelet kube-proxy

7. Create the mysql.yaml test file.

apiVersion: v1
kind: ReplicationController
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: docker.io/mysql:5.6.40
        ports:
        - containerPort: 3306
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "123456"

8. Launch the workload

kubectl create -f mysql.yaml
kubectl delete -f mysql.yaml  # this deletes the workload

9. Check whether it started

kubectl describe pod mysql

————————————
10. If you see errors like the following

Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  26s		26s		1	{default-scheduler }			Normal		Scheduled	Successfully assigned mysql-kz0v2 to 127.0.0.1
  25s		13s		2	{kubelet 127.0.0.1}			Warning		FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request.  details: (open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory)"
 
  2s	2s	1	{kubelet 127.0.0.1}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/rhel7/pod-infrastructure:latest\""
then handle it as follows:
wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
rpm -ivh python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
If the install fails, remove the previously installed conflicting package and rerun the install command:
yum remove subscription-manager-rhsm-certificates -y
Then test again:
# delete the previously created RC
kubectl delete -f mysql.yaml
# create a new RC
kubectl create -f mysql.yaml
If it still fails, try pulling the pod-infrastructure image manually:
docker pull registry.access.redhat.com/rhel7/pod-infrastructure:latest

GitLab docker configuration

version: '3'
 
services:
  proxy:
    restart: always
    image: jwilder/nginx-proxy:latest
    ports:
    - "80:80"
    volumes:
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
    - /var/run/docker.sock:/tmp/docker.sock:ro
 
  redis:
    restart: always
    image: sameersbn/redis:3.0.6
    command:
    - --loglevel warning
    volumes:
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
    - /home/abc/volume/gitlab/redis:/var/lib/redis:Z
 
  postgresql:
    restart: always
    image: sameersbn/postgresql:9.6-2
    volumes:
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
    - /home/abc/volume/gitlab/postgresql:/var/lib/postgresql:Z
    environment:
    - DB_USER=gitlab
    - DB_PASS=password
    - DB_NAME=gitlabhq_production
    - DB_EXTENSION=pg_trgm
 
  gitlab:
    restart: always
    image: sameersbn/gitlab:10.2.4
    depends_on:
    - redis
    - postgresql
    ports:
    - "10080:80"
    - "10022:22"
    volumes:
    - /etc/localtime:/etc/localtime:ro
    - /etc/timezone:/etc/timezone:ro
    - /home/abc/volume/gitlab/gitlab:/home/git/data:Z
    environment:
    - VIRTUAL_HOST=gitlab.xxxx.com
    - DEBUG=false
 
    - DB_ADAPTER=postgresql
    - DB_HOST=postgresql
    - DB_PORT=5432
    - DB_USER=gitlab
    - DB_PASS=password
    - DB_NAME=gitlabhq_production
 
    - REDIS_HOST=redis
    - REDIS_PORT=6379
 
    - TZ=Asia/Kolkata
    - GITLAB_TIMEZONE=Kolkata
 
    - GITLAB_HTTPS=false
    - SSL_SELF_SIGNED=false
 
    - GITLAB_HOST=gitlab.kxtry.com
    - GITLAB_PORT=10080
    - GITLAB_SSH_PORT=10022
    - GITLAB_RELATIVE_URL_ROOT=
    - GITLAB_SECRETS_DB_KEY_BASE=long-and-random-alphanumeric-string
    - GITLAB_SECRETS_SECRET_KEY_BASE=long-and-random-alphanumeric-string
    - GITLAB_SECRETS_OTP_KEY_BASE=long-and-random-alphanumeric-string
 
    - GITLAB_ROOT_PASSWORD=yyyyyy
    - GITLAB_ROOT_EMAIL=xxxx
 
    - GITLAB_NOTIFY_ON_BROKEN_BUILDS=true
    - GITLAB_NOTIFY_PUSHER=false
 
    - GITLAB_EMAIL=notifications@example.com
    - GITLAB_EMAIL_REPLY_TO=noreply@example.com
    - GITLAB_INCOMING_EMAIL_ADDRESS=reply@example.com
 
    - GITLAB_BACKUP_SCHEDULE=daily
    - GITLAB_BACKUP_TIME=01:00
 
    - SMTP_ENABLED=false
    - SMTP_DOMAIN=www.example.com
    - SMTP_HOST=smtp.gmail.com
    - SMTP_PORT=587
    - SMTP_USER=mailer@example.com
    - SMTP_PASS=password
    - SMTP_STARTTLS=true
    - SMTP_AUTHENTICATION=login
 
    - IMAP_ENABLED=false
    - IMAP_HOST=imap.gmail.com
    - IMAP_PORT=993
    - IMAP_USER=mailer@example.com
    - IMAP_PASS=password
    - IMAP_SSL=true
    - IMAP_STARTTLS=false
 
    - OAUTH_ENABLED=false
    - OAUTH_AUTO_SIGN_IN_WITH_PROVIDER=
    - OAUTH_ALLOW_SSO=
    - OAUTH_BLOCK_AUTO_CREATED_USERS=true
    - OAUTH_AUTO_LINK_LDAP_USER=false
    - OAUTH_AUTO_LINK_SAML_USER=false
    - OAUTH_EXTERNAL_PROVIDERS=
 
    - OAUTH_CAS3_LABEL=cas3
    - OAUTH_CAS3_SERVER=
    - OAUTH_CAS3_DISABLE_SSL_VERIFICATION=false
    - OAUTH_CAS3_LOGIN_URL=/cas/login
    - OAUTH_CAS3_VALIDATE_URL=/cas/p3/serviceValidate
    - OAUTH_CAS3_LOGOUT_URL=/cas/logout
 
    - OAUTH_GOOGLE_API_KEY=
    - OAUTH_GOOGLE_APP_SECRET=
    - OAUTH_GOOGLE_RESTRICT_DOMAIN=
 
    - OAUTH_FACEBOOK_API_KEY=
    - OAUTH_FACEBOOK_APP_SECRET=
 
    - OAUTH_TWITTER_API_KEY=
    - OAUTH_TWITTER_APP_SECRET=
 
    - OAUTH_GITHUB_API_KEY=
    - OAUTH_GITHUB_APP_SECRET=
    - OAUTH_GITHUB_URL=
    - OAUTH_GITHUB_VERIFY_SSL=
 
    - OAUTH_GITLAB_API_KEY=
    - OAUTH_GITLAB_APP_SECRET=
 
    - OAUTH_BITBUCKET_API_KEY=
    - OAUTH_BITBUCKET_APP_SECRET=
 
    - OAUTH_SAML_ASSERTION_CONSUMER_SERVICE_URL=
    - OAUTH_SAML_IDP_CERT_FINGERPRINT=
    - OAUTH_SAML_IDP_SSO_TARGET_URL=
    - OAUTH_SAML_ISSUER=
    - OAUTH_SAML_LABEL="Our SAML Provider"
    - OAUTH_SAML_NAME_IDENTIFIER_FORMAT=urn:oasis:names:tc:SAML:2.0:nameid-format:transient
    - OAUTH_SAML_GROUPS_ATTRIBUTE=
    - OAUTH_SAML_EXTERNAL_GROUPS=
    - OAUTH_SAML_ATTRIBUTE_STATEMENTS_EMAIL=
    - OAUTH_SAML_ATTRIBUTE_STATEMENTS_NAME=
    - OAUTH_SAML_ATTRIBUTE_STATEMENTS_FIRST_NAME=
    - OAUTH_SAML_ATTRIBUTE_STATEMENTS_LAST_NAME=
 
    - OAUTH_CROWD_SERVER_URL=
    - OAUTH_CROWD_APP_NAME=
    - OAUTH_CROWD_APP_PASSWORD=
 
    - OAUTH_AUTH0_CLIENT_ID=
    - OAUTH_AUTH0_CLIENT_SECRET=
    - OAUTH_AUTH0_DOMAIN=
 
    - OAUTH_AZURE_API_KEY=
    - OAUTH_AZURE_API_SECRET=
    - OAUTH_AZURE_TENANT_ID=
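The three GITLAB_SECRETS_*_KEY_BASE values above must each be replaced with a long random string (GitLab uses them to encrypt stored secrets, sessions, and OTP data). One common way to generate them, assuming openssl is available:

```shell
# Generate 64-hex-char random strings for the three GitLab secret keys.
DB_KEY_BASE=$(openssl rand -hex 32)
SECRET_KEY_BASE=$(openssl rand -hex 32)
OTP_KEY_BASE=$(openssl rand -hex 32)
printf 'GITLAB_SECRETS_DB_KEY_BASE=%s\n' "$DB_KEY_BASE"
printf 'GITLAB_SECRETS_SECRET_KEY_BASE=%s\n' "$SECRET_KEY_BASE"
printf 'GITLAB_SECRETS_OTP_KEY_BASE=%s\n' "$OTP_KEY_BASE"
```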