List of Elastic Beats
Below is a list of Beats and the role each plays in collecting and shipping data.
Filebeat
Reads and collects log files. It supports a large number of log formats and can ship the logs directly to Elasticsearch for indexing.
Metricbeat
Collects metrics from the operating system and from services running on the server, including the Elasticsearch servers themselves.
Packetbeat
Collects network data by listening to network traffic. It includes advanced filtering and protocol parsers.
Winlogbeat
Collects Windows event logs, for example security events.
Auditbeat
Collects audit data from Linux systems. It can also be used for file integrity monitoring (FIM) and for tracking system changes.
Heartbeat
Checks whether services are up and reachable by probing them periodically (uptime monitoring).
Functionbeat
Runs as a serverless function, for example on AWS Lambda, to collect events from cloud services and send them to the Elastic Stack.
Common Elasticsearch infrastructure requirements
- Memory: Elasticsearch requires a significant amount of memory for proper operation, particularly for the JVM heap. As a general rule, allocate at least 2GB of memory to the JVM heap and have at least 4GB of physical memory available on the machine.
- CPU: Elasticsearch is generally CPU-intensive, and you should aim for a machine with at least two cores.
- Storage: Elasticsearch stores data in the form of indices, which can take up a significant amount of space. Make sure you have enough storage capacity for your data. Local storage, ideally SSD, generally gives the best performance. Network storage such as a NAS or a SAN can be used, but it adds latency, so test it against your workload first.
- Operating system: Elasticsearch is supported on a wide range of operating systems, including Linux, Windows, and macOS.
It’s also important to note that these are just general guidelines, and the actual system requirements may vary depending on the specific needs of your Elasticsearch deployment.
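As a rough illustration of the memory guideline, the sketch below derives a heap size from the physical memory of a node. It assumes the common rule of thumb of giving the JVM half of the RAM and capping the heap at about 31GB; the function name `suggest_heap_mb` and the cap value are illustrative choices, not part of the original setup.

```shell
# Suggest a JVM heap size in MB from total physical memory in MB.
# Assumption: half of RAM, capped at 31 GB (31744 MB).
suggest_heap_mb() {
    total_mb=$1
    heap_mb=$((total_mb / 2))
    max_heap_mb=31744
    if [ "$heap_mb" -gt "$max_heap_mb" ]; then
        heap_mb=$max_heap_mb
    fi
    echo "$heap_mb"
}

# Print matching JVM heap flags for an 8 GB node.
heap=$(suggest_heap_mb 8192)
printf '%s\n' "-Xms${heap}m" "-Xmx${heap}m"
```

For an 8GB node this prints `-Xms4096m` and `-Xmx4096m`, which could be placed in the JVM settings file. Setting both flags to the same value avoids heap resizing at runtime.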
Structure of Elasticsearch
The structure of Elasticsearch consists of the following main components:
- Indices: An index is a collection of documents that have a similar structure. Indices are used to store and retrieve data in Elasticsearch.
- Types: A type was a logical subdivision of an index with its own structure. Mapping types are deprecated in Elasticsearch 7.x, where each index holds a single type (_doc), and they are removed in 8.0.
- Documents: A document is a basic unit of information in Elasticsearch. It consists of a set of key-value pairs, or fields, and can be indexed, searched, and retrieved.
- Mapping: Mapping is the process of defining the structure of a document in Elasticsearch. It involves specifying the fields and data types of a document, as well as other metadata such as analyzers and indexing options.
- Shards: A shard is a partition of an index that is stored on a specific node in the Elasticsearch cluster. Shards are used to distribute the data stored in an index across multiple nodes, allowing Elasticsearch to scale horizontally and handle large volumes of data.
- Replicas: A replica is a copy of a shard that is stored on a different node in the Elasticsearch cluster. Replicas are used to provide fault tolerance and improve the availability of data in the event of node failures.
Overall, the structure of Elasticsearch is designed to allow for the efficient storage, indexing, and retrieval of large volumes of data. It is highly scalable and can be used to search and analyze data from a variety of sources.
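To make the relationship between shards and replicas concrete, here is a small sketch that counts how many shard copies an index places on the cluster. The function name `total_shard_copies` and the index sizing (3 primaries, 1 replica) are hypothetical values chosen for illustration.

```shell
# Total shard copies of an index = primaries * (1 + replicas per primary).
total_shard_copies() {
    primaries=$1
    replicas=$2
    echo $((primaries * (1 + replicas)))
}

# Hypothetical index with 3 primary shards and 1 replica of each.
echo "shard copies: $(total_shard_copies 3 1)"
```

This prints `shard copies: 6`. Because a replica is never stored on the same node as its primary, an index with 1 replica needs at least two data nodes before all of its copies can be assigned.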
Operating Systems
Elasticsearch can be deployed on the following operating systems and package formats:
- Windows
- Linux
- macOS
- deb packages (Debian, Ubuntu)
- rpm packages (RHEL, CentOS, and similar)
- Docker
Installing Elasticsearch on Ubuntu 20.04
System requirements
SERVER NAME | SYSTEM REQUIREMENTS | NETWORK REQUIREMENTS | SERVER ROLE |
---|---|---|---|
Node01 | 8GB memory + 4 cores + 100GB disk | 9200 (requests), 9300 (cluster), SSH (management) | Master |
Node02 | 8GB memory + 4 cores + 100GB disk | 9200 (requests), 9300 (cluster), SSH (management) | Data |
Node03 | 8GB memory + 4 cores + 100GB disk | 9200 (requests), 9300 (cluster), SSH (management) | ML |
List of possible roles
- master
- data
- data_content
- data_hot
- data_warm
- data_cold
- data_frozen
- ingest
- ml – Machine learning node
- remote_cluster_client
- transform
Each role is described in the node settings section of the Elasticsearch documentation.
Installing servers
Deploy Ubuntu Server 20.04 on Node01, Node02, and Node03.
Installing Ubuntu for Elasticsearch
During installation, configure the following:
- Static IP address
- DNS address
- Server and user information
- Enable the SSH server
Setting up a Firewall for Elasticsearch
sudo ufw enable
Opening ports for Elasticsearch and management
sudo ufw allow ssh comment "Management port"
sudo ufw allow 9200 comment "Elasticsearch requests"
sudo ufw allow 9300 comment "Elasticsearch cluster communication"
Checking the status of the firewall on the servers
sudo ufw status
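The status check can be partly automated. The sketch below scans the text of `ufw status` and prints every required port that has no ALLOW rule. The helper name `missing_ufw_ports` is hypothetical, and the sample status text is an illustrative mock-up, not output captured from a real server.

```shell
# Print each required port that has no ALLOW rule in `ufw status` output.
# $1: text of `ufw status`; remaining arguments: ports to look for.
missing_ufw_ports() {
    status_text=$1
    shift
    for port in "$@"; do
        if ! printf '%s\n' "$status_text" | grep -Eq "^${port}(/tcp)?[[:space:]]+ALLOW"; then
            echo "$port"
        fi
    done
}

# Illustrative sample where the rule for port 9300 was forgotten.
sample_status='Status: active

To                         Action      From
--                         ------      ----
22/tcp                     ALLOW       Anywhere
9200                       ALLOW       Anywhere'

missing_ufw_ports "$sample_status" 22 9200 9300
```

With the sample above the function prints `9300`. On a real server the first argument would be `"$(sudo ufw status)"`.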
Software requirements
Installing net-tools
sudo apt install net-tools
Import the Elastic PGP key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
Installing apt-transport-https
sudo apt-get install apt-transport-https
Add the Elastic APT repository
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list
Installing Elasticsearch with apt
sudo apt-get update && sudo apt-get install elasticsearch=7.14.2
Installing Metricbeat
Installing Metricbeat on each server lets us monitor all servers from Kibana.
sudo apt-get update && sudo apt-get install metricbeat=7.14.2
Edit the settings in the metricbeat.yml file
sudo nano /etc/metricbeat/metricbeat.yml
Because minimal security is not yet configured for Elasticsearch, you do not have to enter an Elasticsearch password.
setup.kibana:
  host: "http://192.168.0.183:5601"
Enable the elasticsearch-xpack module
sudo metricbeat modules enable elasticsearch-xpack
Enable Metricbeat at server boot
sudo systemctl enable metricbeat
Start Metricbeat
sudo service metricbeat start
Repeat all of the steps above on each server you want to add to the Elasticsearch cluster.
Change Elasticsearch settings
There are three configuration files on each server:
- elasticsearch.yml: Elasticsearch default settings
- jvm.options: Java settings
- log4j2.properties: Log settings for Elasticsearch
Edit the elasticsearch.yml file:
cd /etc/elasticsearch
sudo nano elasticsearch.yml
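Change the values in the elasticsearch.yml file for Node01:
#Cluster name
cluster.name: main
#Node name
node.name: node01
#Node roles (legacy setting, deprecated since 7.9 in favor of node.roles)
node.master: true
#Listen on all network interfaces
network.host: 0.0.0.0
#Discover all nodes in the main cluster
discovery.seed_hosts: ["192.168.0.180:9300", "192.168.0.181:9300", "192.168.0.182:9300"]
#Bootstrap a brand-new cluster (only needed the first time the cluster starts)
cluster.initial_master_nodes: ["node01"]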
Change the values in the elasticsearch.yml file for Node02:
#Cluster name
cluster.name: main
#Node name
node.name: node02
#Node roles
node.data: true
#Listen on all network interfaces
network.host: 0.0.0.0
#Discover all nodes in the main cluster
discovery.seed_hosts: ["192.168.0.180:9300", "192.168.0.181:9300", "192.168.0.182:9300"]
Change the values in the elasticsearch.yml file for Node03:
#Cluster name
cluster.name: main
#Node name
node.name: node03
#Configure the node for machine learning
node.ml: true
#Listen on all network interfaces
network.host: 0.0.0.0
#Discover all nodes in the main cluster
discovery.seed_hosts: ["192.168.0.180:9300", "192.168.0.181:9300", "192.168.0.182:9300"]
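The three files differ only in the node name and role, so they can be generated from one template. The sketch below is one possible way to do that. It assumes the `node.roles` setting, which replaces `node.master`, `node.data`, and `node.ml` from Elasticsearch 7.9 onward and should not be combined with those legacy settings in the same file. The helper name `render_es_config` is hypothetical.

```shell
# Render a minimal elasticsearch.yml for one node to stdout.
# $1: node name, $2: node role, $3: comma-separated, quoted seed hosts.
render_es_config() {
    cat <<EOF
cluster.name: main
node.name: $1
node.roles: [ $2 ]
network.host: 0.0.0.0
discovery.seed_hosts: [ $3 ]
EOF
}

# Seed hosts taken from the node configurations above.
seeds='"192.168.0.180:9300", "192.168.0.181:9300", "192.168.0.182:9300"'

# Example: print the configuration for the data node.
render_es_config node02 data "$seeds"
```

The output can be reviewed first and then written into place, for example by piping it to `sudo tee /etc/elasticsearch/elasticsearch.yml`. Note that `node.roles: [ data ]` makes Node02 a dedicated data node, which is stricter than `node.data: true` on its own.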
Running Elasticsearch as a system service
Check whether the system uses systemd or SysV init
ps --no-headers -o comm 1
Enabling Elasticsearch at boot and allowing up to 3 minutes for startup
sudo mkdir -p /etc/systemd/system/elasticsearch.service.d
echo -e "[Service]\nTimeoutStartSec=180" | sudo tee /etc/systemd/system/elasticsearch.service.d/startup-timeout.conf
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service
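The drop-in file only needs two lines, and `echo -e` behaves differently across shells, so a `printf` based variant may be more predictable. This is a minimal sketch; the helper name `render_timeout_dropin` is hypothetical.

```shell
# Print a systemd drop-in that sets the service start timeout in seconds.
render_timeout_dropin() {
    printf '[Service]\nTimeoutStartSec=%s\n' "$1"
}

# Preview the drop-in for a 180 second (3 minute) timeout.
render_timeout_dropin 180
```

To install it, the output could be piped to `sudo tee /etc/systemd/system/elasticsearch.service.d/startup-timeout.conf`, followed by `sudo systemctl daemon-reload`.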
Run Elasticsearch
sudo service elasticsearch start
sudo service elasticsearch status
Enable Metricbeat at server boot
sudo systemctl enable metricbeat
Checking cluster health from Node01
curl -X GET "localhost:9200/_cluster/health?pretty"
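The health response is JSON, and the field that matters most is `status` (green, yellow, or red). The sketch below extracts it with `sed` so that it can be used in a script. The helper name `health_status` is hypothetical, and the sample response is an illustrative, trimmed mock-up rather than real cluster output.

```shell
# Extract the "status" value from a _cluster/health JSON response on stdin.
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Illustrative response, trimmed to three fields.
sample_health='{
  "cluster_name" : "main",
  "status" : "green",
  "number_of_nodes" : 3
}'

printf '%s\n' "$sample_health" | health_status
```

With the sample above this prints `green`. Against a running node the equivalent would be `curl -s "localhost:9200/_cluster/health" | health_status`. A green status together with `number_of_nodes` equal to 3 would indicate that all three nodes have joined the cluster.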