The Open Source data- and event-driven Platform

Self-hosted (on-premise or cloud) Log Management Platform

Platform Features


Open Source – Big Data – Real time (Kafka®) – Advanced ETL (SkaLogs ETL) – {Indexing, Storage, Archiving} (Elasticsearch®, HDFS) – Self-hosted (On-premise, Cloud) – {private, public, hybrid, multi}-Cloud – Multi-{cluster, instance, tenant} Environments – Virtualization (OpenStack, KVM) – Containerization (Docker) – Orchestration (Kubernetes) – Managed Infrastructure (Rancher) – Scripted (Ansible, Shell, yaml) – Automated Deployment (Ansible) – Secured (SSL, Elastic Security, IPsec) – Managed Updates – Directory Connectors (LDAP, Kerberos, AD) – Self-Monitored – Error Retry Mechanism – Storage Connectors – Metrics and Computations (Statistical, Advanced functions, Machine Learning) – Visualization Templates (Grafana, Kibana) – Notifications (thresholds, SMS, email, Slack…)

SkaLogs simplifies your Log journey

SkaLogs is a self-hosted, enterprise-grade Open Source – Big Data – Real-Time Platform designed to help you deploy, manage, and scale a centralized log management solution. It is based on 3 core components (Apache V2):

  1. SkaETL: an Open Source (SkaETL GitHub Repo) real-time ETL (extract-transform-load) developed by SkaLogs,
  2. ELK Stack: Elasticsearch, Logstash, Kibana,
  3. Kafka.

Thanks to its advanced ETL (SkaETL), a deployed SkaLogs instance can be adapted to several use cases (see Solutions).

The SkaLogs Platform consists of a bundle which deploys many services and scales them according to the resources (cloud or on-premise) allocated to the deployed instance:

SkaLogs Bundle (GitHub repo):

The entire platform is deployed via a single Ansible script which:

  • installs a bundle consisting of the above-mentioned 3 core components (SkaETL, ELK, Kafka),
  • adds multiple side components,
  • assembles the pieces into a scalable, automated, resilient, self-monitored, and complete end-to-end Log Management Platform,
  • provides an entirely managed infrastructure (Rancher) with containerized (Docker) and orchestrated (Kubernetes) components.
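
As an illustration, an Ansible inventory for such a deployment might look like the sketch below. The group and host names here are hypothetical, for illustration only; the actual group layout depends on the SkaLogs playbook.

```yaml
# Hypothetical inventory sketch -- actual group names depend on the SkaLogs playbook
all:
  children:
    kafka_nodes:
      hosts:
        kafka-01:
        kafka-02:
        kafka-03:
    elasticsearch_nodes:
      hosts:
        es-01:
        es-02:
    skaetl_nodes:
      hosts:
        etl-01:
```

The playbook is then run once against this inventory and provisions every service on the listed hosts.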

The SkaLogs bundle includes:

  • Rancher as a container management platform,
  • SkaETL (developed by SkaLogs) as an advanced log-dedicated ETL with multiple guided workflows to help you with all the difficult tasks:
    • Logs: collect, transform, normalize, parse, aggregate,
    • Metrics: compute (before ES ingestion), store, search, investigate,
    • Alerts: create thresholds with alerts and notifications,
    • Visualize:
      • before ES ingestion: monitor data before ingestion and indexing in Elasticsearch
      • after ES ingestion:
        • Kibana as a front-end to Elasticsearch
        • Grafana as a front-end for technical monitoring

Core Features:

  • Self-hosted (on-premise or cloud) complete end-to-end centralized Log Management Platform
  • Scripted and Automated deployment (Ansible, Shell and yaml scripts)
  • Container management (Rancher container management platform) and Orchestration (Kubernetes)
  • Guided workflows for
    • data and Log ingestion and transformation (structured and unstructured)
    • real-time metrics computations and insights
    • interfacing with your own ML algorithms

SkaETL Features:

SkaETL is a specialized, 100% log-dedicated ETL developed by SkaLogs, allowing you to process structured or unstructured data. The difficult task of data transformation is greatly simplified thanks to multiple guided workflows:

  • Ingest, parse, transform, enrich, normalize, aggregate, index, archive
  • Compute (simple statistics, complex functions, and ML algorithms)
  • Search and investigate
  • Visualize and monitor
  • Alerts and notifications

Technical Features:

  • Microservices based architecture
  • Packages multiple Open Source libraries and frameworks
  • Entirely managed infrastructure with containerized and orchestrated components
  • Automated (scripted) base deployment
    • 50+ services
    • 150+ docker containers
  • Resilient
    • Error retry mechanism,
    • Data buffer guaranteed by Kafka,
    • Infrastructure resilience via Rancher,
    • Self-monitored
  • Scalable 
    • Volume: scales without limits
    • Speed: ingest at 100 K+ EPS (events/second, i.e. JSON documents/second)

INFRASTRUCTURE

Adapts to your requirements and infrastructure constraints: On-Prem, {private,public,hybrid,multi}-cloud architectures and multi-{cluster,instance,tenant} environments

SCALABILITY - RESILIENCE

Built-in scaling and resilience of each critical component (Kafka, ETL, ES) using state-of-the-art containerized orchestration (Kubernetes or Cattle). Container management of every component (Rancher)

DATA STREAMING - BUFFERING

Stream Logs and Events from multiple sources in real time. Data ingestion and buffering are managed by a Kafka cluster, minimizing the risk of data loss

SKAETL

SkaLogs' unique 100% log-dedicated ETL: its log transformation engine greatly reduces your time to production via guided workflows, and enhances your ability to import any type of log, then transform, parse, compute, and visualize in real time

METRICS – COMPUTATIONS

Guided SkaETL workflows and templates to perform calculations on your parsed data to create metrics and KPIs. Offloads all the heavy computations away from ES. Pre-defined metrics templates, and complex calculations via the proprietary SkaLang.

ALERTS - NOTIFICATIONS

Define customized thresholds based on computed metrics. Create alerts and notifications based on thresholds and events
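
The threshold logic can be sketched in a few lines of Python. The metric names and threshold values below are illustrative, not SkaLogs defaults:

```python
# Illustrative threshold-based alerting sketch (not the actual SkaETL implementation)

def check_thresholds(metrics, thresholds):
    """Return an alert message for every metric that exceeds its threshold."""
    alerts = []
    for name, value in metrics.items():
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            alerts.append(f"ALERT: {name}={value} exceeds threshold {limit}")
    return alerts

# Example: metrics computed upstream by the ETL
metrics = {"error_rate_pct": 7.5, "p95_latency_ms": 180}
thresholds = {"error_rate_pct": 5.0, "p95_latency_ms": 250}

for alert in check_thresholds(metrics, thresholds):
    print(alert)  # each alert would then be routed to email, Slack, etc.
```

In the real platform the notification fan-out (SMS, email, Slack) happens after this check, so one threshold can feed several channels.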

ELASTICSEARCH

We manage the Elasticsearch cluster, and optimize data replication and partitioning by tuning shard allocation across your ES instances

CAPACITY: VOLUME & VELOCITY

Very high capacity in terms of daily ingested volume, total volume, and velocity. Daily: up to 10 TB/day. Total volume: up to 4 PB of raw data without replication. Speed: 100 K+ EPS (events per second, i.e. JSON documents per second)
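
These figures are mutually consistent: at 100 K events per second, and assuming an average event size of roughly 1 KB (an assumption for the sake of the arithmetic, not a SkaLogs specification), a full day of ingestion stays within the stated 10 TB/day ceiling:

```python
# Back-of-the-envelope check of the capacity figures (1 KB/event is an assumption)
eps = 100_000                  # events per second
seconds_per_day = 86_400
avg_event_bytes = 1_000        # assumed average JSON document size

events_per_day = eps * seconds_per_day           # 8.64 billion events/day
bytes_per_day = events_per_day * avg_event_bytes
tb_per_day = bytes_per_day / 1e12                # stays under the 10 TB/day ceiling
print(round(tb_per_day, 2))  # 8.64
```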

VISUALIZATION - DASHBOARD

Predefined and customizable Dashboards and Templates, with Charts and Indicators (Infrastructure and Applications). Technical metrics, KPIs.

SkaETL Features


Pronounced “skettle”, SkaETL is a unique real time Open Source ETL designed for and dedicated to Log processing and transformation. It is an innovative approach to data ingestion and transformation, with computing, monitoring, and alerting capabilities based on user defined thresholds. SkaETL parses and enhances data from Kafka topics to any output: {Kafka enhanced topics, Elasticsearch, more to come}.

SkaETL provides guided workflows simplifying the complex task of importing any kind of machine data. Sample workflows: data ingestion pipelines, grok parsing simulations, metric computations, referential creation, Kafka live stream monitors.

Kafka Input – Real-Time – Error Retry – Guided Workflows – Ingestion Pipelines – Grok Parsing Simulations – Logstash Configurator – Referentials Creation – CMDB Referential – Monitoring – Alerting – Computations (before ES) – AI/ML – SkaLogs Simple Language (Query, Computations, Correlations) – Visualization (before ES, Kibana) – Output to Kafka (enhanced topics) and ES – Notifications


REAL TIME PROCESSING

Real time ingestion, transformation, parsing of Logs. Parses and enhances data from Kafka topics to any output: Kafka (enhanced topics), Elasticsearch, email, Slack. Real time computations, monitoring, and alerts.


LOG TYPES

Collect and manage any Log type from Infrastructure, Network, Security, AI, Applications (Business, Functional), Business (Events)


LOG ORIGIN

OS (Operating Systems), DB (Databases), Servers (Web, Mail, etc…). Storage, Virtualization, Containers, Orchestration, Cloud Services, APIs, Network, Antivirus, Firewall, IDS (Intrusion Detection Systems), WAF (Web Application Firewall), Devices, IoT, Scada systems


ACTIONS

Collect, Centralize, Ingest, Transform, Normalize, Aggregate, Parse, Compute, Store, Archive, Detect, Monitor, Alert, Visualize


GUIDED WORKFLOWS

Guided workflows to simplify all the difficult tasks: Ingestion Pipeline (collect, transform, normalize, parse, aggregate), Grok pattern simulation, complex Logstash conf generation, Metrics computation (simple, complex, conditional), Referential creation, Kafka Live Stream monitoring.


ERROR HANDLING

Error retry mechanism for log ingestion and parsing, enabling you to recover from any downtime and prevent data loss. Process several streams of data simultaneously (retry queues, live queues) without having to manage them yourself
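
The retry mechanism can be pictured as a second queue that failed events fall into, to be re-processed later without blocking the live stream. The sketch below is a simplified in-memory illustration; in the actual platform the queues are backed by Kafka, not Python deques:

```python
from collections import deque

# Simplified in-memory sketch of a live queue plus a retry queue
# (illustrative only -- SkaLogs buffers data in Kafka, not in-process)

def parse(event):
    """Toy parser: fails when the event has no 'msg' field."""
    if "msg" not in event:
        raise ValueError("unparseable event")
    return {"parsed": event["msg"]}

def process(live, retry):
    """Drain the live queue; push failures onto the retry queue for later."""
    results = []
    while live:
        event = live.popleft()
        try:
            results.append(parse(event))
        except ValueError:
            retry.append(event)   # kept for a later retry pass: no data loss
    return results

live = deque([{"msg": "ok"}, {"bad": True}, {"msg": "fine"}])
retry = deque()
results = process(live, retry)
print(len(results), len(retry))  # 2 parsed, 1 waiting for retry
```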

GROK PATTERN SIMULATION

Guided workflow for Log parsing via grok patterns. Simulate the result of grok patterns on ingested Logs and validate the Log transformation and normalization process. Large set of pre-defined grok patterns
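
Grok patterns compile down to regular expressions with named captures, so the simulation step can be illustrated in plain Python. The pattern below parses a simplified access-log line; the field names are illustrative, not a SkaLogs schema:

```python
import re

# Rough equivalent of a grok pattern like "%{IP:client} %{WORD:method} %{URIPATH:path}"
# expressed as a plain regex with named groups (simplified, illustrative)
LOG_PATTERN = re.compile(
    r"(?P<client>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<method>[A-Z]+)\s+"
    r"(?P<path>/\S*)"
)

def simulate(line):
    """Return the extracted fields, or None if the pattern does not match."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None

print(simulate("192.168.0.1 GET /index.html"))
# {'client': '192.168.0.1', 'method': 'GET', 'path': '/index.html'}
```

Running a candidate pattern against sample lines like this, before wiring it into the pipeline, is exactly what the guided simulation workflow automates.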


LOGSTASH CONFIGURATION

Generate complex Logstash configurations via a guided workflow. Once your ingestion and transformation workflow is complete, with a simple button click you can generate any Logstash conf file, no matter how complex the Log transformation
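
The generated output is a standard Logstash pipeline configuration. A minimal example of the kind of file produced might look like this (topic names, hosts, and index names are placeholders, not SkaLogs defaults):

```
input {
  kafka {
    topics => ["processed-logs"]
    bootstrap_servers => "kafka:9092"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```

Real configurations typically chain many more filters (mutate, date, geoip…); the point of the workflow is that this file is generated, not hand-written.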


REFERENTIALS

Build data referentials on the fly based on events processed by SkaETL. Guided workflow to create referentials for further re-use, allowing you to fine-tune analysis and avoid re-processing.
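
Building a referential on the fly amounts to accumulating a lookup table keyed on a chosen field while events stream through. The sketch below is a simplified illustration with made-up field names, not the SkaETL data model:

```python
# Sketch of on-the-fly referential creation: one entry per distinct host,
# updated as events are processed (illustrative field names)

def update_referential(referential, event):
    """Record the latest metadata seen for each host."""
    host = event.get("host")
    if host:
        referential[host] = {
            "ip": event.get("ip"),
            "last_seen": event.get("timestamp"),
        }
    return referential

referential = {}
events = [
    {"host": "web-01", "ip": "10.0.0.5", "timestamp": "2024-01-01T10:00:00Z"},
    {"host": "web-01", "ip": "10.0.0.5", "timestamp": "2024-01-01T10:05:00Z"},
    {"host": "db-01",  "ip": "10.0.0.9", "timestamp": "2024-01-01T10:01:00Z"},
]
for e in events:
    update_referential(referential, e)
print(sorted(referential))  # ['db-01', 'web-01']
```

Once built, the table can answer lookups during later analysis without re-processing the original event stream.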

CMDB REFERENTIAL

Generate a CMDB referential to get a snapshot view of your entire IT estate (machines, devices, objects, servers, services, virtual machines, containers)


COMPUTATIONS (TEMPLATES)

Guided workflow to perform simple metric computations. Standard statistical functions, and complex custom formulas enabled via the simple SkaLogs Language. You may perform extensive computations and define your own custom functions, no matter the complexity
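
A metric computation of the kind these templates cover, performed before Elasticsearch ingestion, can be sketched as a simple aggregation over parsed events. Field names here are illustrative:

```python
# Illustrative pre-ingestion metric: average response time per HTTP status,
# computed in the ETL so Elasticsearch only has to store the result

def response_time_by_status(events):
    """Group response times by status code and average each bucket."""
    buckets = {}
    for e in events:
        buckets.setdefault(e["status"], []).append(e["response_ms"])
    return {status: sum(times) / len(times) for status, times in buckets.items()}

events = [
    {"status": 200, "response_ms": 120},
    {"status": 200, "response_ms": 80},
    {"status": 500, "response_ms": 300},
]
print(response_time_by_status(events))  # {200: 100.0, 500: 300.0}
```

Offloading this aggregation to the ETL is what keeps the heavy computation away from ES, as described above.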


REAL TIME VISUALIZATION

Real time monitoring and exploration of all processes managed by the ETL: ingestion, transformation, computations, error retry, referentials, Kafka live streams


KAFKA LIVE STREAM MONITOR

Monitor Log ingestion within Kafka via the ETL. Metrics to monitor message flow, input and output. Monitor individually every topic created, and aggregate statistics


SKALOGS LANGUAGE

Easy-to-use SQL-like SkaLogs Language to perform complex queries and computations via guided workflows. Perform complex event correlations (e.g. SIEM) via customizable templates for cross-domain security analysis


MONITORING - ALERTS

Real time monitoring, alerting, and notifications based on events and user-defined thresholds. Define at least one output from your ingestion process, and create multiple outputs to email, Slack, snmp, system_out

Solutions

SkaLogs covers all your enterprise machine data monitoring and analytics needs: Centralized Log Management Platform (LM), Business Activity Monitoring (BAM), IT Monitoring (ITOA), or Security Management (SIEM)

Get in Touch

Are you interested in trying SkaLogs? Do you need help deploying SkaLogs? Would you like to contribute to the project? Or are you just curious…