
{Big Data} Auxiliary Systems

Published: 2018-02-08 02:42:18

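1.1 Introduction to Flume

1.1.1 Overview

Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting large volumes of log data.

Flume can collect source data in many forms, such as files and socket packets, and can write the collected data to many external storage systems, including HDFS, HBase, Hive, and Kafka.

Ordinary collection requirements can be met with simple Flume configuration.

Flume also offers good custom extension capabilities for special scenarios, so it suits most everyday data collection scenarios.

1.1.2 How It Works

1. The central role in a Flume deployment is the agent; a Flume collection system is formed by connecting agents to one another.

2. Each agent acts as a data courier and contains three components:

a) Source: the collection point, which connects to the data source and obtains data from it

b) Sink: the destination for collected data, which passes data on to the next-tier agent or to the final storage system

c) Channel: the data transfer channel inside the agent, which carries data from the source to the sink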
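
To make the three roles concrete, here is a minimal Python sketch of the source, channel, and sink flow described above. It is an illustration only, not Flume code: `MemoryChannel` and `run_agent` are hypothetical names, and the sketch leaves out the transactions that real Flume channels use.

```python
from collections import deque


class MemoryChannel:
    """Bounded FIFO buffer standing in for a channel (hypothetical, illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()

    def put(self, event):
        # A full channel rejects new events instead of growing without bound.
        if len(self.events) >= self.capacity:
            raise OverflowError("channel is full")
        self.events.append(event)

    def take(self, max_events):
        # Hand over at most max_events events, oldest first.
        batch = []
        while self.events and len(batch) < max_events:
            batch.append(self.events.popleft())
        return batch


def run_agent(lines, channel, batch_size):
    """Source step: push every line into the channel. Sink step: drain it in batches."""
    for line in lines:
        channel.put(line)
    delivered = []
    while True:
        batch = channel.take(batch_size)
        if not batch:
            break
        delivered.append(batch)
    return delivered
```

Calling `run_agent(["e1", "e2", "e3"], MemoryChannel(capacity=10), batch_size=2)` delivers the events in two batches, which loosely mirrors how a channel decouples the rate of the source from the rate of the sink.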

1.1.4 Flume Collection System Architecture

1. Simple structure: a single agent collects the data

2. Complex structure: multiple agents chained in series

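1.2 Flume in Practice

1.2.1 Installing and Deploying Flume

1. Installing Flume is simple: just unpack the archive. This assumes a Hadoop environment is already in place.

Upload the installation package to the node where the data source lives.

Then unpack it: tar -zxvf apache-flume-1.8.0-bin.tar.gz

Configure the environment variables: vi /etc/profile (the FLUME_HOME value below assumes the unpacked directory has been renamed or linked to /home/hadoop/apps/flume)

HIVE_HOME=/home/hadoop/apps/hive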

HBASE_HOME=/home/hadoop/apps/hbase

ZOOKEEPER_HOME=/home/hadoop/apps/zookeeper

HADOOP_HOME=/home/hadoop/apps/hadoop-2.8.1

JAVA_HOME=/opt/jdk1.8.0_121

FLUME_HOME=/home/hadoop/apps/flume

PATH=$FLUME_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

export FLUME_HOME HIVE_HOME HBASE_HOME ZOOKEEPER_HOME HADOOP_HOME JAVA_HOME PATH USER LOGNAME MAIL HOSTNAME HISTSIZE HISTCONTROL

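Next, go into the Flume directory and edit flume-env.sh under conf (copy it from flume-env.sh.template if it does not exist yet) to set JAVA_HOME:

export JAVA_HOME=/opt/jdk1.8.0_121

2. Describe the collection plan in a configuration file according to your collection requirements (the file name is arbitrary).

3. Start a Flume agent on the relevant node, pointing it at that configuration file.

Start with the simplest possible example to check that the environment works.

1. Create a new file in Flume's conf directory:

vi netcat-logger.conf

# Name the components of this agent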

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe and configure the source component: r1

a1.sources.r1.type = netcat

a1.sources.r1.bind = localhost

a1.sources.r1.port = 44444

# Describe and configure the sink component: k1

a1.sinks.k1.type = logger

# Describe and configure the channel component; this one buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Describe how the source, channel, and sink are connected

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

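2. Start the agent to collect data

bin/flume-ng agent -c conf -f conf/netcat-logger.conf -n a1 -Dflume.root.logger=INFO,console

-c conf specifies the directory that holds Flume's own configuration files

-f conf/netcat-logger.conf specifies the collection plan we wrote

-n a1 specifies the name of this agent

3. Test

First send some data to the port the agent listens on, so that it has something to collect.

From any machine that can reach the agent node over the network:

telnet agent-hostname port (for example, telnet localhost 44444)

With bind = localhost as configured above, the source accepts connections only from the agent node itself; bind to the node's address or 0.0.0.0 to accept remote connections.

If the telnet command is not found, install it as follows:

[root@hdp08 ~]# yum install telnet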
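
If telnet is not available, a small script can send the test data instead. The following is a minimal sketch, assuming the netcat-logger agent above is listening on localhost:44444; `send_lines` is a hypothetical helper name, and the script ignores any acknowledgement the source may send back.

```python
import socket


def send_lines(host, port, lines, timeout=5.0):
    """Send each line, newline-terminated, over a single TCP connection."""
    payload = "".join(line + "\n" for line in lines).encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    # Return the number of bytes sent so the caller can log or check it.
    return len(payload)


# Example call (requires a running agent):
# send_lines("localhost", 44444, ["hello flume", "second event"])
```

Each newline-terminated line should show up as one event in the agent's console log, the same as a line typed into telnet.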

1.2.2 Collection Examples

1. Collecting log files from a specified directory

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

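# Watch a directory: spoolDir is the directory to monitor; fileHeader controls whether each event gets a header holding the source file's absolute path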

a1.sources.r1.type = spooldir

a1.sources.r1.spoolDir = /home/hadoop/flumespool

a1.sources.r1.fileHeader = true

# Describe the sink

a1.sinks.k1.type = logger

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

Start command:

bin/flume-ng agent -c ./conf -f ./conf/spool-logger.conf -n a1 -Dflume.root.logger=INFO,console
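
The spooling directory source expects files to be complete and unchanged once they appear in the watched directory, and it expects file names not to be reused. One way to respect that when feeding test data is to write the file elsewhere and then move it in. The sketch below assumes that approach; `drop_into_spool` and the staging directory are hypothetical, and the staging directory is assumed to be on the same filesystem as the spool directory so that the rename is atomic.

```python
import os


def drop_into_spool(staging_dir, spool_dir, file_name, text):
    """Write text to staging_dir first, then rename it into spool_dir."""
    target = os.path.join(spool_dir, file_name)
    # Refuse names that are still pending or already processed
    # (.COMPLETED is the source's default suffix for finished files).
    if os.path.exists(target) or os.path.exists(target + ".COMPLETED"):
        raise FileExistsError(f"{file_name} was already used in {spool_dir}")
    staged = os.path.join(staging_dir, file_name)
    with open(staged, "w", encoding="utf-8") as handle:
        handle.write(text)
    # On the same filesystem the rename is atomic, so the file appears fully written.
    os.rename(staged, target)
    return target


# Example call (the staging path is an assumption):
# drop_into_spool("/home/hadoop/staging", "/home/hadoop/flumespool", "app.log.1", "line one\n")
```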

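2. Collecting a directory into HDFS

Requirement: new files keep appearing in a particular directory on a server, and each new file must be collected into HDFS as soon as it shows up.

Based on this requirement, first define the following three elements:

• Source: monitor a file directory: spooldir

• Sink: the HDFS file system: hdfs sink

• Channel between the source and the sink: either a file channel or a memory channel can be used

Configuration file: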

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

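# Watch a directory: spoolDir is the directory to monitor; fileHeader controls whether each event gets a header holding the source file's absolute path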

a1.sources.r1.type = spooldir

a1.sources.r1.spoolDir = /home/hadoop/flumespool

a1.sources.r1.fileHeader = true

# Describe the sink

a1.sinks.k1.type = hdfs

a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/

a1.sinks.k1.hdfs.filePrefix = events-

a1.sinks.k1.hdfs.round = true

a1.sinks.k1.hdfs.roundValue = 10

a1.sinks.k1.hdfs.roundUnit = minute

a1.sinks.k1.hdfs.rollInterval = 3

a1.sinks.k1.hdfs.rollSize = 20

a1.sinks.k1.hdfs.rollCount = 5

a1.sinks.k1.hdfs.batchSize = 1

a1.sinks.k1.hdfs.useLocalTimeStamp = true

# File type to generate; the default is SequenceFile, while DataStream writes plain text

a1.sinks.k1.hdfs.fileType = DataStream

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1
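
The hdfs.path escape sequences combine with the round settings in the sink configuration above: with roundValue = 10 and roundUnit = minute, the timestamp is rounded down to a 10-minute boundary before the path is built. The Python sketch below reproduces that arithmetic for illustration; `bucket_path` is a hypothetical helper, and it only handles rounding in minutes with a value that divides 60.

```python
from datetime import datetime


def bucket_path(ts, round_minutes=10):
    """Round ts down to a multiple of round_minutes, then format the directory path."""
    rounded = ts.replace(
        minute=ts.minute - ts.minute % round_minutes,
        second=0,
        microsecond=0,
    )
    # %y, %m, %d, %H, and %M are used here with the same meaning as in hdfs.path.
    return rounded.strftime("/flume/events/%y-%m-%d/%H%M/")


print(bucket_path(datetime(2018, 2, 8, 2, 42, 18)))
# prints /flume/events/18-02-08/0240/
```

All events stamped between 02:40:00 and 02:49:59 therefore land in the same directory.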

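Channel parameters:

capacity: the maximum number of events the channel can hold

transactionCapacity: the maximum number of events taken from the source or delivered to the sink per transaction

keep-alive: the timeout, in seconds, for adding an event to or removing an event from the channel

3. Collecting a file into HDFS

Requirement: a business system writes logs with log4j, for example, and the log keeps growing; the data appended to the log file must be collected into HDFS in real time.

Based on this requirement, first define the following three elements:

• Source: monitor updates to the file content: exec 'tail -F file'

• Sink: the HDFS file system: hdfs sink

• Channel between the source and the sink: either a file channel or a memory channel can be used

Configuration file: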

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = exec

a1.sources.r1.command = tail -F /home/hadoop/log/test.log

a1.sources.r1.channels = c1

# Describe the sink

a1.sinks.k1.type = hdfs

a1.sinks.k1.channel = c1

a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/

a1.sinks.k1.hdfs.filePrefix = events-

a1.sinks.k1.hdfs.round = true

a1.sinks.k1.hdfs.roundValue = 10

a1.sinks.k1.hdfs.roundUnit = minute

a1.sinks.k1.hdfs.rollInterval = 3

a1.sinks.k1.hdfs.rollSize = 20

a1.sinks.k1.hdfs.rollCount = 5

a1.sinks.k1.hdfs.batchSize = 1

a1.sinks.k1.hdfs.useLocalTimeStamp = true

# File type to generate; the default is SequenceFile, while DataStream writes plain text

a1.sinks.k1.hdfs.fileType = DataStream

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

Start command:

bin/flume-ng agent -c conf -f conf/tail-hdfs.conf -n a1 -Dflume.root.logger=INFO,console
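
To exercise the tail-based configuration, something has to keep appending to the log file. The following is a minimal sketch of such a generator, assuming the log path from the configuration above; `append_lines` and the line format are hypothetical and stand in for whatever the real business system writes.

```python
import time


def append_lines(path, count, delay=0.0):
    """Append count timestamped lines to path, flushing after each one."""
    with open(path, "a", encoding="utf-8") as log:
        for i in range(count):
            log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} test event {i}\n")
            # Flush so that tail -F sees each line as soon as it is written.
            log.flush()
            time.sleep(delay)
    return count


# Example call: one line every half second for 100 lines.
# append_lines("/home/hadoop/log/test.log", 100, delay=0.5)
```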

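4. Sending collected data to another agent

The first agent collects the data, here with a spooldir source, and sends it to an avro port.

Another node can be configured with an avro source to relay the data on to external storage.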

##################

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = spooldir

a1.sources.r1.spoolDir = /home/hadoop/flumespool

a1.sources.r1.fileHeader = true

# Describe the sink

a1.sinks = k1

a1.sinks.k1.type = avro

a1.sinks.k1.channel = c1

a1.sinks.k1.hostname = hdp09

a1.sinks.k1.port = 4141

a1.sinks.k1.batch-size = 2

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

The second agent receives data from the avro port and sinks it to the logger.

Collection configuration file, avro-hdfs.conf:

# Name the components on this agent

a1.sources = r1

a1.sinks = k1

a1.channels = c1

# Describe/configure the source

a1.sources.r1.type = avro

a1.sources.r1.channels = c1

a1.sources.r1.bind = 0.0.0.0

a1.sources.r1.port = 4141

# Describe the sink

a1.sinks.k1.type = logger

# Use a channel which buffers events in memory

a1.channels.c1.type = memory

a1.channels.c1.capacity = 1000

a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel

a1.sources.r1.channels = c1

a1.sinks.k1.channel = c1

Send test data directly to the avro source:

$ bin/flume-ng avro-client -H localhost -p 4141 -F /usr/logs/log.10

1.3 More Source and Sink Components

Flume supports many source and sink types; see the official user guide for details:

http://flume.apache.org/FlumeUserGuide.html


Copyright © 2018-2023 廣州騰科網絡技術有限公司 All rights reserved 粵ICP備12042194號
