Important

With PerfSPEC, security policies can be managed and monitored proactively, using ranking, learning, and profiles that account for safety, performance, and resource costs.

It has three phases:

Ranking
Learning
Runtime

This repository focuses on the Learning phase, with attention to:

Event logs: loading and processing
Predictive learning model

Additional documents are available:

Quick start and installation
Introduction to why and what is done
About goals and experiences
Presentation slides (in Spanish) explaining the process and environment

Note

The event data collected in raw-audit-logs.log.xz is considered realistic and representative enough to simulate administrative operations.

Files

Data

raw-audit-logs.log contains raw Kubernetes audit logs collected using the audit-policy.yaml audit policy.
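
As a minimal sketch of how such a file could be read, the snippet below assumes the log is stored as one JSON audit event per line (the usual output of the Kubernetes log audit backend) and that it may still be xz-compressed. The function names `parse_audit_lines` and `read_audit_log` are hypothetical and are not part of this repository.

```python
import json
import lzma
from typing import Iterable, Iterator


def parse_audit_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed JSON line, skipping blank or broken lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate truncated or non-JSON lines
        if isinstance(event, dict):
            yield event


def read_audit_log(path: str) -> Iterator[dict]:
    """Stream events from a plain or .xz-compressed audit log (assumed JSON lines)."""
    opener = lzma.open if path.endswith(".xz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        yield from parse_audit_lines(handle)
```

For example, `for event in read_audit_log("raw-audit-logs.log.xz"): ...` would stream events without decompressing the archive to disk first.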

Layout

Tools are distributed in directories:

Content structure overview with notes

    ├── PerfSPEC.pdf                   Reference document
    ├── README.md
    ├── about.md
    ├── actions_distribution.pdf       Generated actions distribution
    ├── collect                        Log collection scripts
    ├── data                           Extracted from compressed archive
    ├── data_sample.tar.xz             Compressed archive with 'data'
    ├── imgs
    ├── full_content_layout.md         Full content layout
    ├── html                           HTML download for notebooks
    ├── install.md                     Installation notes
    ├── intro.md
    ├── learning
    ├── models                         Extracted from compressed archive
    ├── models_sample.tar.xz           Compressed archive with 'models'
    ├── presentacion.pdf               Presentation slides
    └── raw-audit-logs.log.xz          Main Raw Logs file

A full directory layout is available.

Since some tasks can be implemented in Python or Rust, the task directories contain, or will contain, one subdirectory per programming language.

Each task and programming language uses a common data directory, where processing output files are generated.

Collect data

If you wish to collect your own dataset, there are several source files that might help:

collect/audit-policy.yaml is the audit policy used to capture Kubernetes event logs; other resources are also required (admission controllers, etc.).
collect/collect.py is a script to trigger the installation and uninstallation of public Helm repositories.
collect/helm-charts.json is a backup of Helm charts used at the time of the collection.
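
To illustrate the install/uninstall cycle described above, here is a rough sketch that only builds the Helm command lines for one chart. It is not the logic of collect/collect.py: the entry layout (`release` and `chart` keys), the function name `helm_cycle_commands`, and the example values are all assumptions, and the real structure of helm-charts.json may differ.

```python
from typing import Dict, List


def helm_cycle_commands(chart: Dict[str, str]) -> List[List[str]]:
    """Build the install and uninstall commands for one chart entry.

    `chart` is a hypothetical entry with 'release' and 'chart' keys.
    """
    release = chart["release"]
    return [
        # --wait blocks until the release resources are ready
        ["helm", "install", release, chart["chart"], "--wait"],
        ["helm", "uninstall", release],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` against a cluster that has audit logging enabled, so that every cycle produces audit events.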

Process data

data/raw-audit-logs.log Raw logs captured from services

data/main-audit-logs.log Data logs, fixed and cleaned

data/actions-dataset-audit.txt Source content for learning models

data/actions_distribution.pdf Autogenerated graph view of actions and events distribution
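
A distribution of actions like the one in that graph could be computed along these lines. This is a sketch under assumptions: the action label format (`verb_resource`) and the helper names are hypothetical, and only the standard `verb` and `objectRef.resource` fields of a Kubernetes audit event are used.

```python
from collections import Counter
from typing import Iterable


def action_label(event: dict) -> str:
    """Build a hypothetical 'verb_resource' label from one audit event."""
    verb = event.get("verb", "unknown")
    # objectRef can be missing or null for non-resource requests
    resource = (event.get("objectRef") or {}).get("resource", "unknown")
    return f"{verb}_{resource}"


def action_distribution(events: Iterable[dict]) -> Counter:
    """Count how often each action label occurs."""
    return Counter(action_label(event) for event in events)
```

`action_distribution(events).most_common(10)` then gives the ten most frequent actions, which is the kind of data a distribution plot would be drawn from.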

Data Models

Caution

These are the default file names and paths; they can be changed:

by modifying the settings

from the command line when running in script mode (add --help for more information)
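
As an illustration of command-line overrides of default paths, the sketch below uses `argparse`. The flag names (`--model-path`, `--history-path`, `--checkpoints-dir`) are hypothetical; run the actual scripts with --help to see the options they really accept. The default values are the paths documented in this section.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser with documented default paths and hypothetical flag names."""
    parser = argparse.ArgumentParser(description="PerfSPEC path settings (sketch)")
    parser.add_argument("--model-path", type=Path,
                        default=Path("models/perfSPEC_model.keras"),
                        help="where the trained model is written or read")
    parser.add_argument("--history-path", type=Path,
                        default=Path("models/history.json"),
                        help="where the training history is written or read")
    parser.add_argument("--checkpoints-dir", type=Path,
                        default=Path("models/checkpoints"),
                        help="directory holding per-epoch checkpoints")
    return parser
```

With no arguments the defaults apply; passing, say, `--model-path other/model.keras` replaces only that one path.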

models/checkpoints is where files are stored during the learning process:

    ├── checkpoints
    │   ├── model_at_epoch_175.keras
    │   └── model_at_epoch_185.keras
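
Given the `model_at_epoch_<N>.keras` naming shown above, the most recent checkpoint can be picked by epoch number rather than by file name order. This is a small sketch; the helper names are hypothetical and not part of lib_perfspec.py.

```python
import re
from typing import Iterable, Optional

# Matches the checkpoint naming shown in the listing above
CHECKPOINT_RE = re.compile(r"model_at_epoch_(\d+)\.keras$")


def checkpoint_epoch(name: str) -> Optional[int]:
    """Return the epoch encoded in a checkpoint file name, or None."""
    match = CHECKPOINT_RE.search(name)
    return int(match.group(1)) if match else None


def latest_checkpoint(names: Iterable[str]) -> Optional[str]:
    """Return the file name with the highest epoch, ignoring other files."""
    numbered = [(checkpoint_epoch(name), name) for name in names]
    numbered = [(epoch, name) for epoch, name in numbered if epoch is not None]
    return max(numbered)[1] if numbered else None
```

A typical call would be `latest_checkpoint(p.name for p in Path("models/checkpoints").iterdir())`, with `Path` imported from `pathlib`. Comparing the parsed integers avoids the string-sorting pitfall where epoch 9 would rank above epoch 10.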

models/perfSPEC_model.keras is the model generated by default

models/history.json is the model history with stats
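
The history file could be inspected with a few lines of standard-library Python. The sketch assumes history.json holds a mapping from metric name to a list of per-epoch values (the shape of a Keras `History.history` dict); that layout is an assumption and should be checked against the real file. The function names are hypothetical.

```python
import json
from typing import Dict, List, Tuple


def load_history(path: str) -> Dict[str, List[float]]:
    """Load the history file (assumed: metric name -> per-epoch values)."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def best_epoch(history: Dict[str, List[float]], metric: str = "loss") -> Tuple[int, float]:
    """Return the 1-based epoch with the lowest value of `metric`, and that value."""
    values = history[metric]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]
```

For instance, `best_epoch(load_history("models/history.json"))` would report the epoch with the lowest training loss, if the file follows the assumed layout.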

Learning Notebooks

lib_perfspec.py Main library with settings

prepare_perfspec.py Prepare data from raw to source for learning models

train_perfspec.py To train model from data

run_perfspec.py To run/check predictions

model_perfspec.py To inspect / review generated models

__pycache__ is created by Python at execution time and is ignored by git.

HTML Notebooks

Notebooks downloaded as HTML with code (no data is included in this mode, only output):

prepare_perfspec.html Prepare data from raw to source for learning models

model_perfspec.html To inspect / review generated models

Reference

[1]: H. Kermabon-Bobinnec et al., "PerfSPEC: Performance Profiling-based Proactive Security Policy Enforcement for Containers," in IEEE Transactions on Dependable and Secure Computing, doi: 10.1109/TDSC.2024.3420712.