Offline Deployment of a CDH5 Cluster

Server topology

[Figure: server topology diagram]

  • FQDN
FQDN                IP
repo-1.cdh.lab      192.168.60.110
cm-1.cdh.lab        192.168.60.111
master-1.cdh.lab    192.168.60.112
work-1.cdh.lab      192.168.60.113
work-2.cdh.lab      192.168.60.114
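
Since this cluster resolves names through /etc/hosts rather than DNS, each row of the table corresponds to one hosts entry. The following is a minimal sketch of that mapping, assuming the short alias is simply the first label of the FQDN. `hosts_line` is a hypothetical helper introduced here for illustration; it only prints to stdout and does not modify /etc/hosts.

```shell
#!/bin/sh
# hosts_line IP FQDN: print one /etc/hosts entry, using the first
# label of the FQDN as a short alias (an assumption of this sketch).
hosts_line() {
    ip=$1
    fqdn=$2
    short=${fqdn%%.*}   # drop everything from the first dot onward
    printf '%s %s %s\n' "$ip" "$fqdn" "$short"
}

# Print the entries for the five servers in the table (stdout only).
hosts_line 192.168.60.110 repo-1.cdh.lab
hosts_line 192.168.60.111 cm-1.cdh.lab
hosts_line 192.168.60.112 master-1.cdh.lab
hosts_line 192.168.60.113 work-1.cdh.lab
hosts_line 192.168.60.114 work-2.cdh.lab
```

The same lines are what a workstation would need in its own hosts file to reach the cluster by name.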

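Hardware configuration per server

  • cm-1
Item                  Value
Minimum CPU           1 core
Minimum memory        4 GB
Minimum swap          8 GB
Minimum disk          40 GB
Recommended CPU       2 cores
Recommended memory    10 GB
Recommended swap      10 GB
Recommended disk      40 GB
  • master-1
Item                  Value
Minimum CPU           1 core
Minimum memory        4 GB
Minimum swap          4 GB
Minimum disk          40 GB
Recommended CPU       2 cores
Recommended memory    6 GB
Recommended swap      6 GB
Recommended disk      40 GB
  • work-x
Item                  Value
Minimum CPU           1 core
Minimum memory        4 GB
Minimum swap          4 GB
Minimum disk          40 GB
Recommended CPU       2 cores
Recommended memory    6 GB
Recommended swap      6 GB
Recommended disk      40 GB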
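
Before installing anything, it can help to confirm that a host actually meets the 4 GB memory minimum listed for every role. The sketch below is one possible check, not part of the original procedure. `mem_total_gb` and `meets_minimum` are hypothetical helpers, and rounding up to whole GB is an assumption made because a "4 GB" machine usually reports slightly less than 4 GiB in /proc/meminfo.

```shell
#!/bin/sh
# mem_total_gb FILE: print MemTotal from a /proc/meminfo-style file,
# rounded up to whole GB (1 GB = 1048576 kB here).
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", int(($2 + 1048575) / 1048576) }' "$1"
}

# meets_minimum ACTUAL_GB MIN_GB: succeed when ACTUAL_GB >= MIN_GB.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Example: compare this host against the 4 GB minimum.
if [ -r /proc/meminfo ]; then
    actual=$(mem_total_gb /proc/meminfo)
    if meets_minimum "$actual" 4; then
        echo "memory OK: ${actual} GB"
    else
        echo "memory below minimum: ${actual} GB"
    fi
fi
```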

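Setup procedure

  1. Using the FQDN table above, set the hostname on every server with the following command, replacing <FQDN> with that server's FQDN:

    hostnamectl set-hostname <FQDN>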
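
A typo in the hostname is easy to miss at this stage and only surfaces later when hosts fail to find each other. As a hedged sketch, the name could be checked against the FQDN table before it is applied. `is_cluster_fqdn` and `set_cluster_hostname` are hypothetical helpers introduced for illustration, and nothing is executed until `set_cluster_hostname` is called explicitly.

```shell
#!/bin/sh
# is_cluster_fqdn NAME: succeed only if NAME is one of the FQDNs
# listed in the table at the top of this post.
is_cluster_fqdn() {
    case "$1" in
        repo-1.cdh.lab|cm-1.cdh.lab|master-1.cdh.lab|work-1.cdh.lab|work-2.cdh.lab)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# set_cluster_hostname NAME: apply the hostname only after validating it.
set_cluster_hostname() {
    if is_cluster_fqdn "$1"; then
        hostnamectl set-hostname "$1"
    else
        echo "refusing to set unknown hostname: $1" >&2
        return 1
    fi
}

# Example (run as root on the matching server):
#   set_cluster_hostname cm-1.cdh.lab
```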
  2. Set up the CDH offline repository.

  3. Fill in the variables in the following script, save it as init-cdh.sh, and run it on every CDH server (repo-1 excluded).

    #!/bin/bash

    # Initialize environment information
    repo_1_hostname='repo-1.cdh.lab'
    repo_1_ip='192.168.60.110'
    cm_1_ip='192.168.60.111'
    master_1_ip='192.168.60.112'
    work_1_ip='192.168.60.113'
    work_2_ip='192.168.60.114'

    # To avoid running a DNS server, map the cluster hostnames in /etc/hosts
    cat >>/etc/hosts <<EOF
    ${repo_1_ip} repo-1.cdh.lab repo-1
    ${cm_1_ip} cm-1.cdh.lab cm-1
    ${master_1_ip} master-1.cdh.lab master-1
    ${work_1_ip} work-1.cdh.lab work-1
    ${work_2_ip} work-2.cdh.lab work-2
    EOF

    cat >>/etc/sysconfig/network <<EOF
    HOSTNAME=$(hostname)
    EOF

    \cp -f /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bk

    # Quote the delimiter so the shell leaves $releasever and $basearch for yum to expand
    cat > /etc/yum.repos.d/CentOS-Base.repo << 'EOF'
    [base]
    name=CentOS-$releasever - Base
    baseurl=http://repo-1.cdh.lab:8889/yum/el7
    #mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
    gpgcheck=1
    gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7
    EOF

    yum makecache fast

    systemctl stop firewalld
    systemctl disable firewalld
    setenforce 0
    sed -i "s/SELINUX=.*/SELINUX=disabled/g" /etc/selinux/config
    echo never >/sys/kernel/mm/transparent_hugepage/defrag
    echo never >/sys/kernel/mm/transparent_hugepage/enabled

    cat >>/etc/rc.local <<EOF
    if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    fi
    if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    fi
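    EOF
    # CentOS 7 only runs rc.local at boot when /etc/rc.d/rc.local is executable
    chmod +x /etc/rc.d/rc.local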

    sysctl vm.swappiness=10
    cat >>/etc/sysctl.conf <<EOF
    vm.swappiness = 10
    EOF

    sysctl -p

    yum install psmisc chrony screen -y

    \cp -f /etc/chrony.conf /etc/chrony.conf.bk
    cat > /etc/chrony.conf << EOF
    driftfile /var/lib/chrony/drift
    logdir /var/log/chrony
    makestep 1.0 3
    rtcsync
    server repo-1.cdh.lab
    EOF

    zdump /usr/share/zoneinfo/Asia/Shanghai
    \mv -f /etc/localtime /etc/localtime.old
    ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
    systemctl enable chronyd
    systemctl restart chronyd
    chronyc sources -v
    hwclock --systohc

    cd /tmp || exit 1
    curl http://repo-1.cdh.lab:8889/pkg/jdk-8u162-linux-x64.tar.gz -O
    tar zxf jdk-8u162-linux-x64.tar.gz
    mkdir -p /usr/java/
    mv -f jdk1.8.0_162 /usr/java/

    cat >>/etc/profile <<EOF
    export JAVA_HOME=/usr/java/jdk1.8.0_162
    export PATH=\$JAVA_HOME/bin:\$PATH
    export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar
    EOF

    source /etc/profile
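
After init-cdh.sh has run, it is worth confirming that the kernel settings it changes actually took effect, since Cloudera Manager's host inspector later complains about both transparent hugepages and swappiness. The following is a rough verification sketch, not part of the original script. `thp_disabled`, `swappiness_ok`, and `report` are hypothetical helpers, and the paths are the standard Linux locations for these settings.

```shell
#!/bin/sh
# thp_disabled FILE: succeed if the active (bracketed) value in a
# transparent_hugepage sysfs file is "never".
thp_disabled() {
    grep -q '\[never\]' "$1" 2>/dev/null
}

# swappiness_ok FILE EXPECTED: succeed if FILE contains exactly EXPECTED.
swappiness_ok() {
    [ "$(cat "$1" 2>/dev/null)" = "$2" ]
}

# report LABEL COMMAND...: run the check and print OK or FAIL with the label.
report() {
    label=$1
    shift
    if "$@"; then
        echo "OK   $label"
    else
        echo "FAIL $label"
    fi
}

report 'THP enabled = never' thp_disabled /sys/kernel/mm/transparent_hugepage/enabled
report 'THP defrag  = never' thp_disabled /sys/kernel/mm/transparent_hugepage/defrag
report 'vm.swappiness = 10'  swappiness_ok /proc/sys/vm/swappiness 10
```

Running it again after a reboot shows whether the rc.local and sysctl.conf entries persist the settings.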
  4. Deploy cm-1

  • MariaDB database information
Role                                 Database     User      Password
Activity Monitor                     amon         amon      amon
Reports Manager                      rman         rman      rman
Hive Metastore Server                metastore    hive      metastore
Sentry Server                        sentry       sentry    sentry
Cloudera Navigator Audit Server      nav          nav       nav
Cloudera Navigator Metadata Server   navms        navms     navms
Oozie                                oozie        oozie     oozie
Hue                                  hue          hue       hue
  • Fill in the variables in the following script, save it as cm.sh, and run it on the cm-1 server.

    #!/bin/bash

    # Initialize environment information
    maria_root_password='maria_db_root'
    scm_db_password='scm_db'

    cd /tmp || exit 1

    curl http://repo-1.cdh.lab:8889/pkg/mysql-connector-java-5.1.46.tar.gz -O
    tar zxf mysql-connector-java-5.1.46.tar.gz
    mkdir -p /usr/share/java
    mkdir -p /var/lib/oozie/
    mkdir -p /usr/lib/hive/lib/
    cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar
    cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /var/lib/oozie/
    cp mysql-connector-java-5.1.46/mysql-connector-java-5.1.46-bin.jar /usr/lib/hive/lib/
    useradd oozie -d /var/lib/oozie
    useradd hive -d /var/lib/hive
    chown oozie:oozie -R /var/lib/oozie
    chown hive:hive -R /var/lib/hive

    cat >/etc/yum.repos.d/cloudera-manager-5-13.repo <<EOF
    [cloudera-manager-5-13]
    # Packages for Cloudera Manager, Version 5, on RedHat or CentOS 7 x86_64
    name=Cloudera Manager
    baseurl=http://repo-1.cdh.lab:8889/yum/cm/5.13.3
    gpgkey =http://repo-1.cdh.lab:8889/yum/cm/RPM-GPG-KEY-cloudera
    gpgcheck = 1
    EOF

    cat > /etc/yum.repos.d/mariadb-10.0.repo << EOF
    # MariaDB 10.0 CentOS repository list - created 2018-07-22 07:46 UTC
    # http://downloads.mariadb.org/mariadb/repositories/
    [mariadb-10.0]
    name = MariaDB
    baseurl = http://repo-1.cdh.lab:8889/yum/mariadb-10.0
    gpgkey=http://repo-1.cdh.lab:8889/yum/mariadb-10.0/RPM-GPG-KEY-MariaDB
    gpgcheck=1
    EOF

    yum makecache fast

    yum install MariaDB-server-10.0.* -y

    cp -f /etc/my.cnf /etc/my.cnf.bk

    cat >/etc/my.cnf <<EOF
    [mysqld]
    datadir=/var/lib/mysql
    socket=/var/lib/mysql/mysql.sock
    transaction-isolation = READ-COMMITTED
    # Disabling symbolic-links is recommended to prevent assorted security risks;
    # to do so, uncomment this line:
    symbolic-links = 0
    # Settings user and group are ignored when systemd is used.
    # If you need to run mysqld under a different user or group,
    # customize your systemd unit file for mariadb according to the
    # instructions in http://fedoraproject.org/wiki/Systemd

    bind-address=0.0.0.0
    default-storage-engine=innodb
    sql_mode=STRICT_ALL_TABLES

    key_buffer = 16M
    key_buffer_size = 32M
    max_allowed_packet = 32M
    thread_stack = 256K
    thread_cache_size = 64
    query_cache_limit = 8M
    query_cache_size = 64M
    query_cache_type = 1

    max_connections = 550
    #expire_logs_days = 10
    #max_binlog_size = 100M

    #log_bin should be on a disk with enough free space.
    #Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
    #system and chown the specified folder to the mysql user.
    log_bin=/var/lib/mysql/mysql_binary_log

    #In later versions of MariaDB, if you enable the binary log and do not set
    #a server_id, MariaDB will not start. The server_id must be unique within
    #the replicating group.
    server_id=1

    binlog_format = mixed

    read_buffer_size = 2M
    read_rnd_buffer_size = 16M
    sort_buffer_size = 8M
    join_buffer_size = 8M

    # InnoDB settings
    innodb_file_per_table = 1
    innodb_flush_log_at_trx_commit = 2
    innodb_log_buffer_size = 64M
    innodb_buffer_pool_size = 4G
    innodb_thread_concurrency = 8
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 512M

    [mysqld_safe]
    log-error=/var/log/mariadb/mariadb.log
    pid-file=/var/run/mariadb/mariadb.pid

    #
    # include all files from the config directory
    #
    !includedir /etc/my.cnf.d
    EOF

    mkdir -p /var/log/mariadb/
    mkdir -p /var/run/mariadb
    mkdir -p /var/lib/mysql

    chown -R mysql:mysql /var/log/mariadb
    chown -R mysql:mysql /var/run/mariadb

    systemctl enable mysql
    systemctl start mysql

    /usr/bin/mysql_secure_installation <<EOF

    Y
    ${maria_root_password}
    ${maria_root_password}
    Y
    n
    Y
    Y
    EOF

    cat >/tmp/init_cdh_db.sql <<EOF
    create database amon DEFAULT CHARACTER SET utf8;
    grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
    create database rman DEFAULT CHARACTER SET utf8;
    grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
    create database metastore DEFAULT CHARACTER SET utf8;
    grant all on metastore.* TO 'hive'@'%' IDENTIFIED BY 'metastore';
    create database sentry DEFAULT CHARACTER SET utf8;
    grant all on sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
    create database nav DEFAULT CHARACTER SET utf8;
    grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
    create database navms DEFAULT CHARACTER SET utf8;
    grant all on navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
    create database oozie default character set utf8;
    grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
    grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
    create database hue default character set utf8 default collate utf8_general_ci;
    grant all on hue.* to 'hue'@'%' identified by 'hue';
    flush privileges;
    select * from information_schema.schemata;
    EOF

    mysql -uroot -p${maria_root_password} </tmp/init_cdh_db.sql
    rm -f /tmp/init_cdh_db.sql

    yum install cloudera-manager-daemons cloudera-manager-server -y

    /usr/share/cmf/schema/scm_prepare_database.sh mysql -h 127.0.0.1 -uroot -p${maria_root_password} scm scm ${scm_db_password}
    rm -f /etc/cloudera-scm-server/db.mgmt.properties
    cat /etc/cloudera-scm-server/db.properties

    service cloudera-scm-server start
    sleep 2
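
The SQL that cm.sh writes to /tmp/init_cdh_db.sql follows directly from the database table earlier in this step: one database and one grant per role. As an illustration of that relationship, the statements could be generated from the table rows instead of being typed by hand. This is only a sketch: `db_sql` is a hypothetical helper, and it leaves out the extra `'oozie'@'localhost'` grant and the Hue collation clause that the script above includes.

```shell
#!/bin/sh
# db_sql DB USER PASSWORD: print the CREATE DATABASE and GRANT
# statements for one service database.
db_sql() {
    printf "create database %s DEFAULT CHARACTER SET utf8;\n" "$1"
    printf "grant all on %s.* TO '%s'@'%%' IDENTIFIED BY '%s';\n" "$1" "$2" "$3"
}

# One "database user password" triple per line, copied from the table.
while read -r db user pass; do
    db_sql "$db" "$user" "$pass"
done <<'EOF'
amon amon amon
rman rman rman
metastore hive metastore
sentry sentry sentry
nav nav nav
navms navms navms
oozie oozie oozie
hue hue hue
EOF
```

The output goes to stdout, so it can be reviewed before being piped into the `mysql` client. Outside a lab, the passwords should not match the usernames.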
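  • To check that the service has started, run the following command:

    tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

    Once the following lines appear, the Cloudera Manager web service is up, and you can continue the deployment by opening <server IP>:7180 in a browser:
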
    2018-07-18 03:13:36,899 INFO WebServerImpl:org.mortbay.log: Started SelectChannelConnector@0.0.0.0:7180
    2018-07-18 03:13:36,899 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
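
Instead of watching `tail -f` by hand, a script can poll the log for the "Started Jetty server" line. The sketch below shows one way to do that under simple assumptions: `cm_web_started` and `wait_for_cm` are hypothetical helpers, the poll interval is one second, and the match is a plain substring search on the log message shown above.

```shell
#!/bin/sh
# cm_web_started LOGFILE: succeed once the "Started Jetty server"
# message is present in the log.
cm_web_started() {
    grep -q 'Started Jetty server' "$1" 2>/dev/null
}

# wait_for_cm LOGFILE TIMEOUT_SECONDS: poll once per second until the
# message appears; fail if it has not appeared when the time runs out.
wait_for_cm() {
    log=$1
    remaining=$2
    while ! cm_web_started "$log"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Example (on cm-1, allowing up to 10 minutes):
#   wait_for_cm /var/log/cloudera-scm-server/cloudera-scm-server.log 600 \
#       && echo "Cloudera Manager is listening on port 7180"
```

Note that the log keeps messages from earlier runs, so after a restart the function may match an old line; checking the timestamp or rotating the log first would avoid that.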
  5. Deploy the CDH cluster with Cloudera Manager

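Open http://cm-1.cdh.lab:7180 in a browser.

PS: To reach the cm-1 web UI by domain name, add its FQDN to the hosts file on your workstation.

The default username and password are both admin.

[Screenshot: Cloudera Manager login]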

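Terms and conditions

[Screenshot: terms and conditions]

Edition comparison

[Screenshot: edition comparison]

Component overview

[Screenshot: component overview]

Scan for hosts
PS: Because resources in this lab environment are limited, work-2 is left out here.

[Screenshot: host scan]

Select hosts

[Screenshot: host selection]

Configure the repository: set up a local Parcel repository

[Screenshot: repository settings 01]

Configure the repository: remove the unneeded Parcel repositories

[Screenshot: repository settings 02]

Configure the repository: add the local Parcel repository

[Screenshot: repository settings 03]

Configure the repository: select the specific Parcel that was detected, and set the local yum source for Cloudera Manager Server

[Screenshot: repository settings 04]

JDK installation options

[Screenshot: JDK installation options]

Single user mode: not recommended. Enabling it requires preparation beforehand.

[Screenshot: single user mode]

Provide SSH login credentials

[Screenshot: SSH login credentials]

cloudera-manager-agent installation

[Screenshot: agent installation 01]

cloudera-manager-agent installation: succeeded

[Screenshot: agent installation 02]

Deploy the Parcel

[Screenshot: Parcel deployment 01]

Deploy the Parcel: succeeded

[Screenshot: Parcel deployment 02]

Inspect hosts for correctness

[Screenshot: host inspection 01]

Inspect hosts for correctness: finished

[Screenshot: host inspection 02]

Select the role services to deploy on the cluster

[Screenshot: role selection]

Assign roles to specific servers. The parts boxed in red in the screenshot below were customized and differ from the defaults.

[Screenshot: role assignment]

Assignment scheme recommended by the official documentation (excerpt)

[Screenshot: assignment scheme recommended by the official documentation]

Database setup

[Screenshot: database setup]

Review changes

[Screenshot: review changes]

Start deploying the cluster role services

[Screenshot: deployment started]

Deployment succeeded

[Screenshot: deployment succeeded]

Congratulations page

[Screenshot: congratulations page]

Home page preview

[Screenshot: home page preview]

This completes the entire deployment process.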

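FAQ

  • Q: Bad: The cluster has 710 under-replicated blocks. The cluster has 710 blocks in total. Percentage of under-replicated blocks: 100.00%. Critical threshold: 40.00%.

    [Screenshot: faq_01]

  • A: HDFS keeps 3 replicas of each block by default, which requires at least 3 DataNodes. Because resources were limited, only 2 DataNodes were selected when this cluster was deployed, so every block is under-replicated. To switch the existing files to 2 replicas, log in to the cm-1 server as root and run the following commands:
    su - hdfs
    hdfs dfs -setrep -w 2 -R /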
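
The value 2 in the command above is not arbitrary: a block cannot have more replicas than there are DataNodes, so the usable replication factor is the DataNode count capped at the HDFS default of 3. A small sketch of that rule follows. `target_replication` and `setrep_command` are hypothetical helpers, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# target_replication DATANODES: print the largest usable replication
# factor, capped at the HDFS default of 3.
target_replication() {
    if [ "$1" -lt 3 ]; then
        echo "$1"
    else
        echo 3
    fi
}

# setrep_command DATANODES: print (not run) the matching setrep command.
setrep_command() {
    printf 'hdfs dfs -setrep -w %s -R /\n' "$(target_replication "$1")"
}

# With the 2 DataNodes used in this post:
setrep_command 2
```

`setrep` only changes files that already exist. For new files to be written with 2 replicas, the HDFS replication factor (`dfs.replication`) would also need to be lowered in the HDFS configuration.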