Hi everyone, I'd like to ask a question.
As shown in the diagram above, I'm using Filebeat to collect local log files into Elasticsearch, with Logstash in between doing format processing. All of the machines are separate servers running on a virtualization platform.
The problem is that throughput is very low: judging from the Filebeat side, the upload rate doesn't even reach 1 MB/s. I can't work out where the bottleneck is.
Below are my relevant configuration files.
1. Filebeat's filebeat.yml:
filebeat.inputs:
2. Logstash's logstash.conf (one of them):
input {
  beats {
    port => "9600"
  }
}
filter {
  mutate {
    split => { "message" => " " }
  }
  mutate {
    add_field => {
      "day0"      => "%{[message][0]}"
      "time"      => "%{[message][6]}"
      "local_ip"  => "%{[message][11]}"
      "remote_ip" => "%{[message][13]}"
    }
  }
  mutate {
    convert => {
      "time"      => "string"
      "local_ip"  => "string"
      "remote_ip" => "string"
    }
  }
  date {
    match     => ["day0", "yyyy-MM-dd"]
    add_field => { "day" => "%{day0}" }
  }
  mutate {
    remove_field => ["message", "day0", "host", "@timestamp", "@version", "path",
                     "input.type", "agent.type", "input", "type", "agent",
                     "log.file.path", "log", "file", "ecs.version", "ecs", "version",
                     "log.offset", "offset"]
  }
}
output {
  elasticsearch {
    hosts => ["192.168.50.11:9200", "192.168.50.12:9200", "192.168.50.13:9200"]
    index => "natlog-1"
  }
}
3. Elasticsearch's elasticsearch.yml (one of them):
cluster.initial_master_nodes: ["es-itcast-cluster"]
node.name: node01
node.master: true
node.data: true
discovery.zen.minimum_master_nodes: 2
bootstrap.memory_lock: false
bootstrap.system_call_filter: false
network.host: 0.0.0.0
discovery.seed_hosts: ["192.168.50.11","192.168.50.12","192.168.50.13"]
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-methods: "GET"
Check the specs of the Logstash server, then raise pipeline.workers and pipeline.batch.size in Logstash to match.
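A minimal sketch of the logstash.yml settings meant here — the numbers are assumptions and should be sized to your cores and heap:

pipeline.workers: 8         # defaults to the number of CPU cores; raise it if cores sit idle
pipeline.batch.size: 2000   # defaults to 125; larger batches mean fewer, larger bulk requests to ES
pipeline.batch.delay: 50    # ms to wait before flushing an under-filled batch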
A few possible angles of analysis:
First check how much log data is actually being generated; if the source itself doesn't exceed 1 MB/s, then that's probably already the ceiling.
If the source files are large and the upload is slow, break the pipeline into its stages and profile each one (a receive-only sketch follows this list):
1. Logstash doing no transformation, receive-only, to measure Filebeat-to-Logstash throughput
2. Test the performance of Logstash's transformations on their own
3. If 1 and 2 show no problem, then the issue is on the ES side
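For step 1, a receive-only pipeline could be as minimal as the sketch below (the port matches the config above; the dots codec prints one dot per event so the event rate is easy to eyeball):

input {
  beats {
    port => "9600"
  }
}
output {
  stdout { codec => dots }
}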
May I ask how to performance-test Logstash?
I ran a test: Filebeat straight to ES, after optimization, can reach 5 MB/s.
Filebeat into Logstash and then on to ES drops to only 100-odd KB/s.
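(For context, tuning of this kind in filebeat.yml is what's involved — the values below are illustrative:)

output.elasticsearch:
  hosts: ["192.168.50.11:9200", "192.168.50.12:9200", "192.168.50.13:9200"]
  worker: 3            # publishing workers per configured host; default is 1
  bulk_max_size: 2048  # events per bulk request; the default is much smaller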
@redhat_natwork:
https://gitee.com/jiemo/logstash-performance-testing has examples you can refer to for testing Logstash.
Logstash is written in Ruby (it runs on JRuby), so its efficiency can be somewhat lower.
The official recommendation now is the Beats model, and Logstash will gradually be displaced; the Beats are implemented in Go, which is a big efficiency improvement.
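Apart from that repo, one common pattern for benchmarking just the filter stage is to push synthetic events through it with the generator input and time the run; the count and the message line below are placeholders:

input {
  generator {
    message => "2020-09-23 09:00:00 Local7.Info 10.0.1.22 ..."   # one representative log line
    count   => 100000
  }
}
# paste the filter {} block under test here
output {
  stdout { codec => dots }
}

Run it with something like time bin/logstash -f bench.conf and divide the count by the elapsed seconds to get events per second.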
@2012: Can Filebeat do the data splitting? My raw data looks like this:
2020-09-23 09:00:00 Local7.Info 10.0.1.22 Sep 23 09:00:21 src@Master-AD-9000-H-wit : [2020-09-23_09-00-21] NAT_LOG_ADD_ENTRY [tcp]10.9.107.242:54449(210.42.24.134:54449) -> 110.43.89.14:80(110.43.89.14:80)
My whole purpose in using Logstash is to split the data, filter out the unnecessary fields, and then store the new fields into ES; I don't know whether Filebeat has this kind of capability.
@redhat_natwork:
Filebeat can be configured to do that.
https://www.elastic.co/cn/beats/
It already supports a list of formats: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-filebeat-options.html
If your format differs from those, a small adjustment of the fields will handle the line you posted; see the sketch below.
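For example, the splitting could be done in filebeat.yml with the dissect and drop_fields processors. This is only a sketch: it assumes Filebeat 7.4+ (where dissect supports the %{?skip} modifier), the tokenizer is my guess at your sample line's layout, and the field names simply mirror your Logstash filter:

processors:
  - dissect:
      tokenizer: "%{day} %{?t} %{?facility} %{?device} %{?mon} %{?dom} %{time} %{?src} %{?colon} %{?ts} %{?action} %{local_ip} %{?arrow} %{remote_ip}"
      field: "message"
      target_prefix: ""    # write the extracted keys at the root of the event
  - drop_fields:
      fields: ["message", "log", "input", "ecs", "agent", "host"]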