---
gitea: none
include_toc: true
---

# PerfSPEC Learning Phase

Based on the [PerfSPEC: Performance Profiling-based Proactive Security Policy Enforcement for Containers](https://ieeexplore.ieee.org/document/10577533) paper presented in [1], this repository contains the source files used to generate and process data.

- Main reference: [PerfSPEC reference document](PerfSPEC.pdf), published as a [white paper](https://en.wikipedia.org/wiki/White_paper)
- [Presentation in Spanish](presentacion.pdf)
- [How to install](https://repo.jesusperez.pro/jesus/perfspec-learning/src/branch/main/install.md) covers the basic environment, tools, and recommendations.

<div style="margin: auto">
<a target="_blank" href="perfspec-learning/src/branch/main/presentacion.pdf"><img src="imgs/perfSPEC-learning.png" width="800"></a>
</div>

__PerfSPEC__

> [!IMPORTANT]
> With `PerfSPEC`, [security policies](https://en.wikipedia.org/wiki/Security_policy) can be managed and monitored in **proactive** mode, using <u>ranking</u>, <u>learning</u>, and <u>profiles</u> to balance safety, performance, and resource costs.

It has three phases:

- Ranking
- Learning
- Runtime

This repository focuses on the __Learning__ phase, with attention to:

- Event logs: loading and processing the collected information
- The predictive learning model

There are additional documents:

- [Quick start](installation.md) and installation guide
- [Intro](intro.md): why and what is done
- [About](about.md): goals and experiences
- [Presentation in Spanish](presentacion.pdf): slides explaining the process and environment

> [!NOTE]
> The __event data__ collected in `raw-audit-logs.log.xz` is considered realistic and representative for simulating administrative operations.

## Files

### Data

- `raw-audit-logs.log` contains raw Kubernetes audit logs collected using the `audit-policy.yaml` audit policy.

### Layout

Tools are organized into the following directories:

- [Collect](collect)
- [Process](process)
- [Learning](learning)

Content structure overview with notes:

<pre>
├── PerfSPEC.pdf             Reference document
├── README.md
├── about.md
├── actions_distribution.pdf Generated actions distribution
├── collect                  Log collection scripts
├── data                     Extracted from the compressed archive
├── data_sample.tar.xz       Compressed archive with 'data'
├── imgs
├── install.md               Installation notes
├── intro.md
├── learning
├── models                   Extracted from the compressed archive
├── models_sample.tar.xz     Compressed archive with 'models'
├── presentacion.pdf         Presentation slides
└── raw-audit-logs.log.xz    Main raw logs file
</pre>

A [full directory layout](full_content_layout.md) is available.

Since some tasks can be implemented in [Python](https://python.org) or [Rust](https://www.rust-lang.org/), there is (or will be) a directory for each programming language inside each task directory.

Each `task/programming-language` directory uses a common __data__ directory where processing output files are generated.

## Collect data

If you wish to [collect](collect) your own dataset, there are several source files that might help:

- `collect/audit-policy.yaml` is the audit policy for capturing [Kubernetes](https://kubernetes.io/) event logs; other resources, such as [admission controllers](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/), are also required.
- `collect/collect.py` is a script to trigger the installation and uninstallation of public Helm repositories.
- `collect/helm-charts.json` is a backup of the Helm charts used at the time of collection.

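As an illustration, a collection driver like `collect/collect.py` could cycle each chart through an install/uninstall pair. This is only a sketch: the JSON layout assumed here (`[{"name": ..., "repo": ...}]`) is hypothetical, not necessarily the actual format of `helm-charts.json`.

```python
import json

def chart_commands(charts_path):
    """Build helm install/uninstall command pairs from a chart list.

    Assumes a JSON layout like [{"name": "...", "repo": "..."}]; the real
    helm-charts.json in this repository may differ.
    """
    with open(charts_path) as f:
        charts = json.load(f)
    commands = []
    for chart in charts:
        release = chart["name"].replace("/", "-")  # a valid release name
        commands.append(["helm", "install", release, chart["name"],
                         "--repo", chart["repo"]])
        commands.append(["helm", "uninstall", release])
    return commands
```

Each pair could then be run with `subprocess.run`, capturing the audit events emitted between install and uninstall.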
## Process data

- `data/raw-audit-logs.log`: raw logs captured from the services
- `data/main-audit-logs.log`: fixed and cleaned data logs
- `data/actions-dataset-audit.txt`: source content for the learning models
- `data/actions_distribution.pdf`: auto-generated graph of the action and event distribution

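To give a flavor of this processing step: Kubernetes audit logs are JSON lines following the audit `Event` schema, so reducing them to action tokens can be sketched as below. Which fields the repository's scripts actually keep is an assumption here.

```python
import json

def extract_actions(lines):
    """Reduce raw Kubernetes audit-log JSON lines to 'verb resource' tokens.

    Field names follow the Kubernetes audit Event schema; the fields the
    repository's processing scripts really use are an assumption.
    """
    actions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt entries
        verb = event.get("verb")
        resource = event.get("objectRef", {}).get("resource")
        if verb and resource:
            actions.append(f"{verb} {resource}")
    return actions
```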
## Data Models

> [!CAUTION]
> These file names and paths are defaults and can be changed:
> - by modifying the [settings](learning/python/lib_perfspec.py)
> - on the <u>command line</u> when running in script mode; add **--help** for more info

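The defaults-plus-overrides pattern can be sketched with `argparse`; the option names below are hypothetical, so check `--help` on each script for the real ones.

```python
import argparse

def build_parser():
    """Command-line overrides for default model paths (option names assumed)."""
    parser = argparse.ArgumentParser(description="PerfSPEC script options (sketch)")
    parser.add_argument("--model-path", default="models/perfSPEC_model.keras",
                        help="where the trained model is written/read")
    parser.add_argument("--checkpoints-dir", default="models/checkpoints",
                        help="directory for intermediate checkpoints")
    return parser
```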
`models/checkpoints` is where files are stored as part of the learning process:

<pre>
└── checkpoints
    ├── model_at_epoch_175.keras
    └── model_at_epoch_185.keras
</pre>

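Resuming from the newest checkpoint amounts to picking the highest epoch out of names like `model_at_epoch_185.keras`; a small filename-parsing sketch, with the naming pattern taken from the listing above:

```python
import re

def latest_checkpoint(filenames):
    """Return the checkpoint filename with the highest epoch number, or None."""
    pattern = re.compile(r"model_at_epoch_(\d+)\.keras$")
    best, best_epoch = None, -1
    for name in filenames:
        match = pattern.search(name)
        if match and int(match.group(1)) > best_epoch:
            best, best_epoch = name, int(match.group(1))
    return best
```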
`models/perfSPEC_model.keras` is the default generated model.

`models/history.json` is the model training history with stats.

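Assuming `history.json` follows the usual Keras `History.history` layout (one list of per-epoch values per metric), it can be summarized with the standard library alone:

```python
import json

def summarize_history(path):
    """Return the final per-epoch value of each metric in a Keras-style history.json."""
    with open(path) as f:
        history = json.load(f)
    return {metric: values[-1] for metric, values in history.items() if values}
```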
## Learning Notebooks

- [lib_perfspec.py](learning/python/lib_perfspec.py): main library with settings
- [prepare_perfspec.py](learning/python/prepare_perfspec.py): prepares data, from raw logs to the source for the learning models
- [train_perfspec.py](learning/python/train_perfspec.py): trains the model from data
- [run_perfspec.py](learning/python/run_perfspec.py): runs/checks predictions
- [model_perfspec.py](learning/python/model_perfspec.py): inspects/reviews generated models

<small>`__pycache__` is created by Python execution and is ignored in git tasks.</small>

## Reference

[1]: [H. Kermabon-Bobinnec et al., "PerfSPEC: Performance Profiling-based Proactive Security Policy Enforcement for Containers," in IEEE Transactions on Dependable and Secure Computing, doi: 10.1109/TDSC.2024.3420712.](https://ieeexplore.ieee.org/document/10577533)