chore: complete items

Jesús Pérez Lorenzo 2025-01-27 09:49:29 +00:00
parent 5a532c5f96
commit b81a0acbfc

@ -3,7 +3,7 @@ gitea: none
include_toc: true
---
# PerfSPEC Learning Phase
Based on the [PerfSPEC: Performance Profiling-based Proactive Security Policy Enforcement for Containers](https://ieeexplore.ieee.org/document/10577533) paper presented in [1], this repository contains the source files used to generate and process data.
@ -51,47 +51,26 @@ Tools are distributed in directories:
- [Process](process)
- [Learning](learning)
<details open>
<summary>Files layout</summary>
Content structure overview with notes
<pre>
<pre>
├── PerfSPEC.pdf Reference document
├── README.md
├── about.md
├── actions_distribution.pdf Generated actions distribution
├── collect Collect logs scripts
│   ├── audit-policy.yaml
│   ├── collect.py
│   └── helm-charts.json
├── data Extracted from compressed archive
│   ├── actions-dataset-audit.txt
│   ├── actions-logs.log
│   ├── actions_distribution.pdf
│   ├── main-audit-logs.log
│   └── raw-audit-logs.log
├── data_sample.tar.xz Compressed archive with 'data'
├── imgs
├── install.md Installation notes
├── intro.md
├── learning
│   └── python
│       ├── __pycache__ Ignored in git
│       ├── lib_perfspec.py
│       ├── model_perfspec.py
│       ├── prepare_perfspec.py
│       ├── run_perfspec.py
│       └── train_perfspec.py
├── models Extracted from compressed archive
│   ├── checkpoints
│   │   ├── model_at_epoch_175.keras
│   │   └── model_at_epoch_185.keras
│   ├── history.json
│   └── perfSPEC_model.keras
├── models_sample.tar.xz Compressed archive with 'models'
├── presentacion.pdf Presentation slides
└── raw-audit-logs.log.xz Main Raw Logs file
</pre>
</details>
</pre>
A [full directory layout](full_content_layout.md) is available.
Because some tasks can be implemented in [Python](https://python.org) or [Rust](https://www.rust-lang.org/), there are (or will be) directories for each programming language inside the task directories.
@ -101,14 +80,48 @@ Each `task/programming-language` use a common __data__ directory where processin
If you wish to [collect](collect) your own dataset, there are several source files that might help:
- `collect/audit-policy.yaml` is for [Kubernetes](https://kubernetes.io/) event log capture; other resources are also required: [admission controllers](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/), etc.
- `collect/collect.py` is a script to trigger the installation and uninstallation of public Helm repositories.
- `collect/helm-charts.json` is a backup of Helm charts used at the time of the collection.
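
The install/uninstall cycle that `collect/collect.py` triggers can be sketched roughly as follows. This is a minimal illustration, not the repository's script: the entry layout (`name`, `chart`, `repo`) is an assumed shape for `helm-charts.json`, the repository URL is a placeholder, and `build_cycle` and `run_cycle` are hypothetical helper names. Only the standard `helm install` and `helm uninstall` commands are used, and the dry-run default prints the commands instead of executing them.

```python
import json
import subprocess

# Assumed (hypothetical) layout of an entry in helm-charts.json;
# the real file in this repository may use different keys.
SAMPLE_CHARTS = json.loads("""
[
  {"name": "demo-nginx", "chart": "nginx", "repo": "https://charts.example.com"}
]
""")


def build_cycle(entry, namespace="default"):
    """Return the helm commands for one install/uninstall cycle."""
    install = [
        "helm", "install", entry["name"], entry["chart"],
        "--repo", entry["repo"],
        "--namespace", namespace,
        "--wait",  # block until the release resources are ready
    ]
    uninstall = ["helm", "uninstall", entry["name"], "--namespace", namespace]
    return [install, uninstall]


def run_cycle(entries, dry_run=True):
    """Install and then uninstall every chart so the audit log records both."""
    executed = []
    for entry in entries:
        for cmd in build_cycle(entry):
            executed.append(cmd)
            if dry_run:
                print(" ".join(cmd))
            else:
                subprocess.run(cmd, check=True)
    return executed
```

Calling `run_cycle(SAMPLE_CHARTS)` prints one `helm install` and one `helm uninstall` command per chart; passing `dry_run=False` would run them against the cluster selected by the current kubeconfig context.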
## Process data
- `raw-audit-logs.log`
- `main-audit-logs.log`
- `actions-dataset-audit.txt`
## Learning
actions_distribution.pdf
## Data Models
> [!CAUTION]
> These are the default file names and paths. They can be changed:
> - by modifying the [settings](learning/python/lib_perfspec.py)
> - on the <u>command line</u> when running in script mode. Add **--help** for more info
`models/checkpoints` is where checkpoint files are stored during the learning process:
<pre>
├── checkpoints
│   ├── model_at_epoch_175.keras
│   └── model_at_epoch_185.keras
</pre>
`models/perfSPEC_model.keras` is the model generated by default
`models/history.json` is the model history with stats
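
As a rough illustration of how these files might be inspected, the sketch below picks the highest-numbered checkpoint and finds the best epoch in the history file. It is not part of the repository: `latest_checkpoint` and `best_epoch` are hypothetical helpers, the file-name pattern is taken from the checkpoint names shown above, and `history.json` is assumed to map metric names to per-epoch lists (the shape of Keras' `History.history`), which may not match the actual file.

```python
import json
import re
from pathlib import Path

# File-name pattern taken from the checkpoint names listed in this section.
CHECKPOINT_RE = re.compile(r"model_at_epoch_(\d+)\.keras$")


def latest_checkpoint(checkpoint_dir):
    """Return (epoch, path) of the highest-numbered checkpoint, or None."""
    found = []
    for path in Path(checkpoint_dir).glob("*.keras"):
        match = CHECKPOINT_RE.search(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return max(found, key=lambda item: item[0]) if found else None


def best_epoch(history_path, metric="loss"):
    """Return (epoch, value) with the lowest value of `metric`.

    Assumes history.json maps metric names to per-epoch lists;
    epochs are reported 1-based.
    """
    with open(history_path, encoding="utf-8") as handle:
        history = json.load(handle)
    values = history[metric]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]
```

With the default layout, `latest_checkpoint("models/checkpoints")` would select `model_at_epoch_185.keras`. The returned path could then be passed to `keras.models.load_model`, assuming the checkpoints are standard Keras archives.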
## Learning Notebooks
`learning/python/lib_perfspec.py` Main library with settings
`learning/python/prepare_perfspec.py` Prepares data, from raw to source, for learning models
`learning/python/train_perfspec.py` Trains the model from data
`learning/python/run_perfspec.py` Runs and checks predictions
`learning/python/model_perfspec.py` Inspects and reviews generated models
<small>`__pycache__` is created by Python during execution and is ignored by git.</small>
## Reference