The Open Source data- and event-driven Platform
Self-hosted (on-premise or cloud) Log Management Platform
Platform Features
SkaLogs simplifies your Log journey
Self-Hosted (On-premise, Cloud) – Open Source – Big Data – Real time (Kafka®) – Advanced ETL (SkaLogs ETL) – {Indexing, Storage, Archiving} (Elasticsearch®, HDFS) – {private, public, hybrid, multi}-Cloud – Multi-{cluster, instance, tenant} Environments – Virtualization (OpenStack, KVM) – Containerization (Docker) – Orchestration (Kubernetes) – Managed Infrastructure (Rancher) – Scripted (Ansible, Shell, yaml) – Automated Deployment (Ansible) – Secured (SSL, Elastic Security, IPsec) – Managed Updates – Directory Connectors (LDAP, Kerberos, AD) – Self-Monitored – Error Retry Mechanism – Storage Connectors – Metrics and Computations (Statistical, Advanced functions, Machine Learning) – Visualization Templates (Grafana, Kibana) – Notifications (thresholds, SMS, email, Slack…)
Self-Hosted, enterprise-grade Open Source - Big Data - Real-Time Platform
Core Components
SkaLogs is designed to help you deploy, manage, and scale a centralized log management solution. It is based on three core components (Apache 2.0 licensed):
- SkaETL: an Open Source real-time Java ETL (extract-transform-load) developed by SkaLogs (see the SkaETL GitHub Repo)
- ELK Stack (Elasticsearch, Logstash, Kibana)
- Kafka
Solutions
Thanks to its advanced ETL (SkaETL), a deployed SkaLogs instance can be tailored to several use cases (Solutions):
- Centralized Log Management Platform
- IT Operation Analytics (ITOA)
- Business Activity Monitoring (BAM)
- Security Information and Event Management (SIEM)
Simplified Deployment
The SkaLogs Platform consists of a bundle, the SkaLogs Bundle (GitHub repo), which deploys many services and scales them according to the allocated self-hosted resources (cloud, on-premise). The entire platform is deployed via a few Ansible scripts which:
- configure VMs (or bare-metal servers),
- install a bundle consisting of the three core components mentioned above (SkaETL, ELK, Kafka),
- add multiple side components (ZooKeeper, Prometheus, Grafana…),
- assemble the pieces into a scalable, automated, resilient, self-monitored, and complete end-to-end Log Management Platform,
- provide an entirely managed infrastructure (Rancher) with containerized (Docker) and orchestrated (Kubernetes) components.
SkaLogs Functional Architecture
(simplified, without self-monitoring)
Platform Features (detailed)
INFRASTRUCTURE
Adapts to your requirements and infrastructure constraints: on-premise, {private, public, hybrid, multi}-cloud architectures and multi-{cluster, instance, tenant} environments
SCALABILITY - RESILIENCE
Built-in scaling and resilience of each critical component (Kafka, ETL, ES) using state-of-the-art containerized orchestration (Kubernetes or Cattle). Container management of every component (Rancher)
DATA STREAMING - BUFFERING
Stream Logs and Events from multiple sources in real time. Data ingestion and buffering are managed by a Kafka cluster, minimizing the risk of data loss
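To make the buffering step concrete, here is a minimal sketch using the plain Apache Kafka Java producer (not SkaLogs-specific code); the broker address and the raw-logs topic name are assumptions:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;

public class LogProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");               // hypothetical broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");                                       // wait for full replication to limit data loss

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String event = "{\"@timestamp\":\"2024-01-01T00:00:00Z\",\"level\":\"ERROR\",\"message\":\"disk full\"}";
            producer.send(new ProducerRecord<>("raw-logs", event));                         // "raw-logs" is a hypothetical input topic
        }
    }
}
```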
SKAETL
SkaLogs' unique ETL, 100% dedicated to Logs: its transformation engine greatly reduces your time to production via guided workflows and enhances your ability to import any type of log, then transform, parse, compute, and visualize it in real time
METRICS – COMPUTATIONS
Guided SkaETL workflows and templates to perform calculations on your parsed data and create metrics and KPIs. Offloads all the heavy computations away from ES. Pre-defined metric templates, and complex calculations via the proprietary SkaLang.
ALERTS - NOTIFICATIONS
Define customized thresholds based on computed metrics. Create alerts and notifications based on thresholds and events
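As a rough illustration of threshold-based alerting (shown here with plain Kafka Streams rather than SkaETL's own guided workflow; the topic names and the latency_ms field are assumptions), a stream of computed metrics could be filtered into an alerts topic like this:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ThresholdAlerts {
    private static final Pattern LATENCY = Pattern.compile("\"latency_ms\":(\\d+)");

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "threshold-alerts");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> metrics = builder.stream("computed-metrics");  // hypothetical topic of metric events (JSON)
        metrics.filter((key, value) -> latencyMs(value) > 500)                 // user-defined threshold
               .to("alerts");                                                  // a notifier (email, Slack...) would consume this topic

        new KafkaStreams(builder.build(), props).start();
    }

    // naive JSON field extraction, just for the sketch
    private static long latencyMs(String json) {
        Matcher m = LATENCY.matcher(json);
        return m.find() ? Long.parseLong(m.group(1)) : 0L;
    }
}
```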
ELASTICSEARCH
We manage the Elasticsearch cluster and optimize data replication and partitioning by balancing shards across your ES instances
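For example, shard and replica counts are per-index settings; the sketch below uses the Elasticsearch 7.x high-level REST Java client with an assumed endpoint, index name, and sizing (SkaLogs applies its own sizing logic automatically):

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.settings.Settings;

public class IndexSetup {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {   // hypothetical ES endpoint

            Settings settings = Settings.builder()
                    .put("index.number_of_shards", 6)                             // spread primaries across data nodes
                    .put("index.number_of_replicas", 1)                           // one replica for resilience
                    .build();

            CreateIndexRequest request = new CreateIndexRequest("logs-2024.01")   // assumed time-based index name
                    .settings(settings);
            client.indices().create(request, RequestOptions.DEFAULT);
        }
    }
}
```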
CAPACITY: VOLUME & VELOCITY
Very high capacity in terms of daily ingested volume, total volume, and velocity: up to 10 TB ingested per day, up to 4 PB of raw data in total (without replication), and over 100,000 EPS (events, i.e. JSON documents, per second)
VISUALIZATION - DASHBOARD
Predefined and customizable Dashboards and Templates, with Charts and Indicators (Infrastructure and Applications). Technical metrics, KPIs.
SkaETL Features
Pronounced “skettle”, SkaETL is a unique real-time Open Source ETL designed for and dedicated to Log processing and transformation. It is an innovative approach to data ingestion and transformation, with computing, monitoring, and alerting capabilities based on user-defined thresholds. SkaETL parses and enhances data from Kafka topics to any output: {Kafka enhanced topics, Elasticsearch, more to come}. SkaETL provides guided workflows simplifying the complex task of importing any kind of machine data. Sample workflows: data ingestion pipelines, grok parsing simulations, metric computations, referential creation, Kafka live stream monitors.
Kafka Input – Real-Time – Error Retry – Guided Workflows – Ingestion Pipelines – Grok Parsing Simulations – Logstash Configurator – Referentials Creation – CMDB Referential – Monitoring – Alerting – Computations (before ES) – AI/ML – SkaLogs Simple Language (Query, Computations, Correlations) – Visualization (before ES, Kibana) – Output to Kafka (enhanced topics) and ES – Notifications
REAL TIME PROCESSING
Real time ingestion, transformation, parsing of Logs. Parses and enhances data from Kafka topics to any output: Kafka (enhanced topics), Elasticsearch, email, Slack. Real time computations, monitoring, and alerts.
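The following sketch shows the general shape of such a pipeline with plain Kafka Streams (not SkaETL's actual implementation); the topic names and the toy normalization are assumptions:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import java.util.Properties;

public class ParseAndEnhance {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "parse-and-enhance");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> raw = builder.stream("raw-logs");      // hypothetical raw input topic
        raw.mapValues(ParseAndEnhance::normalize)                      // parse / normalize each event
           .to("enhanced-logs");                                       // enhanced topic, later indexed into Elasticsearch

        new KafkaStreams(builder.build(), props).start();
    }

    // toy normalization: wrap a plain-text line into a JSON envelope with a guessed level
    private static String normalize(String line) {
        String level = line.contains("ERROR") ? "ERROR" : "INFO";
        return "{\"level\":\"" + level + "\",\"message\":\"" + line.replace('"', '\'') + "\"}";
    }
}
```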
LOG TYPES
Collect and manage any Log type from Infrastructure, Network, Security, AI, Applications (Business, Functional), Business (Events)
LOG ORIGIN
OS (Operating Systems), DB (Databases), Servers (Web, Mail, etc.), Storage, Virtualization, Containers, Orchestration, Cloud Services, APIs, Network, Antivirus, Firewalls, IDS (Intrusion Detection Systems), WAF (Web Application Firewalls), Devices, IoT, SCADA systems
ACTIONS
Collect, Centralize, Ingest, Transform, Normalize, Aggregate, Parse, Compute, Store, Archive, Detect, Monitor, Alert, Visualize
GUIDED WORKFLOWS
Guided workflows to simplify all the difficult tasks: Ingestion Pipelines (collect, transform, normalize, parse, aggregate), Grok pattern simulation, complex Logstash configuration generation, Metrics computation (simple, complex, conditional), Referential creation, Kafka Live Stream monitoring.
ERROR HANDLING
Error retry mechanism for log ingestion and parsing, enabling you to recover from downtime and prevent data loss. Process several streams of data simultaneously (retry queues, live queues) without having to manage them yourself
GROK PATTERN SIMULATION
Guided workflow for Log parsing via grok patterns. Simulate the result of grok patterns on ingested Logs and validate the Log transformation and normalization process. Large set of pre-defined grok patterns
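As an illustration of what a grok simulation checks (using the open source java-grok library here, not SkaETL's own simulator), a pattern can be tested against a sample line and the extracted fields inspected:

```java
import io.krakens.grok.api.Grok;
import io.krakens.grok.api.GrokCompiler;
import io.krakens.grok.api.Match;
import java.util.Map;

public class GrokSimulation {
    public static void main(String[] args) {
        GrokCompiler compiler = GrokCompiler.newInstance();
        compiler.registerDefaultPatterns();                          // load the built-in pattern library
        Grok grok = compiler.compile("%{COMMONAPACHELOG}");          // pattern under test

        String line = "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] \"GET /index.html HTTP/1.0\" 200 2326";
        Match match = grok.match(line);
        Map<String, Object> fields = match.capture();                // named fields extracted by the pattern

        fields.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```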
LOGSTASH CONFIGURATION
Generate complex Logstash configurations via a guided workflow. Once your ingestion and transformation workflow is complete, a simple button click generates the corresponding Logstash conf file, no matter how complex the Log transformation
REFERENTIALS
Build data referentials on the fly based on events processed by SkaETL. A guided workflow creates referentials for further re-use, allowing you to fine-tune analysis and avoid re-processing.
CMDB REFERENTIAL
Generate a CMDB referential for a snapshot view of your entire IT estate (machines, devices, objects, servers, services, virtual machines, containers)
COMPUTATIONS (TEMPLATES)
Guided workflow to perform simple metric computations. Standard statistical functions, as well as complex custom formulas, are enabled via the SkaLogs simple Language: you can run extensive computations and define your own custom functions, no matter the complexity
REAL TIME VISUALIZATION
Real time monitoring and exploration of all processes managed by the ETL: ingestion, transformation, computations, error retry, referentials, Kafka live streams
KAFKA LIVE STREAM MONITOR
Monitor Log ingestion within Kafka via the ETL. Metrics track message flow, input, and output. Monitor every created topic individually, and aggregate statistics
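A very rough equivalent with the plain Kafka AdminClient (broker address and topic name are assumptions) reads each partition's end offset to follow message flow:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartition;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class TopicMonitor {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical broker address

        try (AdminClient admin = AdminClient.create(props)) {
            String topic = "raw-logs";                                           // hypothetical topic name
            TopicDescription desc = admin.describeTopics(Collections.singleton(topic))
                                         .values().get(topic).get();

            // ask for the latest offset of every partition of the topic
            Map<TopicPartition, OffsetSpec> request = new HashMap<>();
            desc.partitions().forEach(p ->
                    request.put(new TopicPartition(topic, p.partition()), OffsetSpec.latest()));

            ListOffsetsResult offsets = admin.listOffsets(request);
            for (TopicPartition tp : request.keySet()) {
                long end = offsets.partitionResult(tp).get().offset();           // high-water mark of the partition
                System.out.println(tp + " end offset = " + end);
            }
        }
    }
}
```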
SKALOGS LANGUAGE
Easy-to-use, SQL-like SkaLogs Language to perform complex queries and computations via guided workflows. Perform complex event correlations (e.g. for SIEM) via customizable templates for cross-domain security analysis
MONITORING - ALERTS
Real time monitoring, alerting, and notifications based on events and user defined thresholds. Define at least one output from your ingestion process, and create multiple outputs to email, Slack, snmp, system_out
Solutions
SkaLogs covers all your enterprise machine data monitoring and analytics needs: Centralized Log Management Platform (LM), Business Activity Monitoring (BAM), IT Monitoring (ITOA), or Security Management (SIEM)
Get in Touch
Are you interested in trying SkaLogs? Do you need help deploying SkaLogs? Would you like to contribute to the project? Or just curious…