This repository was archived by the owner on Dec 19, 2025. It is now read-only.

randogoth/lyagushka

Repository moved to codeberg.org/randogoth/lyagushka.git

lyagushka

(Russian лягушка [lʲɪˈɡuʂkə]: frog)

Lyagushka is a Rust command-line tool inspired by the Fatum Project's 'Zhaba' algorithm (Russian 'zhaba': toad). It extends that algorithm to make it more versatile.

It analyzes a one-dimensional dataset of integers to identify clusters of closely grouped points ("attractors") and significant gaps between those clusters ("voids"). For each cluster or gap it calculates a z-score that measures its statistical significance relative to the dataset's mean density and mean distance between points. The analysis results, including attractors, voids, and their z-scores, are output as a JSON string.
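As a rough illustration of the z-score idea, the sketch below computes the distances between consecutive sorted points and standardizes them with the textbook formula (value minus mean, divided by the population standard deviation). The function names `sorted_gaps` and `z_scores` are hypothetical, and the tool's exact formula and sign convention are not documented here, so treat this as an assumption rather than as Lyagushka's implementation.

```rust
/// Distances between consecutive points after sorting.
/// (Hypothetical helper, not part of Lyagushka's API.)
fn sorted_gaps(points: &[i64]) -> Vec<f64> {
    let mut sorted = points.to_vec();
    sorted.sort_unstable();
    sorted.windows(2).map(|w| (w[1] - w[0]) as f64).collect()
}

/// Textbook z-scores: (x - mean) / population standard deviation.
/// Returns zeros when the values have no spread.
/// (Assumption: the tool may standardize differently.)
fn z_scores(values: &[f64]) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let std_dev = variance.sqrt();
    values
        .iter()
        .map(|v| if std_dev == 0.0 { 0.0 } else { (v - mean) / std_dev })
        .collect()
}
```

In this sketch an unusually wide gap gets a positive score. The tool itself may score spans by density instead, in which case a void would come out negative.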

Building

With a Rust and Cargo environment set up, simply run:

cargo build --release

To also compile a Python wheel, you need Maturin set up:

pipenv install
pipenv shell
maturin build --release
pip install target/wheels/lyagushka-1.1.0*.whl

Usage

Parameters

  • filename.txt (optional): A file containing a newline-separated list of integers to analyze. If not provided, the program expects input from stdin.
  • factor: A floating-point value by which the mean density/span is multiplied to form the threshold for attractor and void detection.
  • min_cluster_size: An integer specifying the minimum number of contiguous points required to be considered a cluster.
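One possible reading of how `factor` and `min_cluster_size` interact is sketched below. It assumes that a run of points counts as an attractor when every gap inside it is at most the mean gap divided by `factor` (that is, at least `factor` times the mean density) and the run has at least `min_cluster_size` points, and that a gap counts as a void when it is at least `factor` times the mean gap. The `Span` enum and the `detect` function are hypothetical names, and the actual thresholds used by the tool may differ.

```rust
/// A detected span. (Hypothetical type, not Lyagushka's output format.)
#[derive(Debug, PartialEq)]
enum Span {
    Attractor { start: i64, end: i64, count: usize },
    Void { start: i64, end: i64 },
}

/// Sketch of threshold-based detection under the assumptions stated above.
fn detect(points: &[i64], factor: f64, min_cluster_size: usize) -> Vec<Span> {
    let mut sorted = points.to_vec();
    sorted.sort_unstable();
    let n = sorted.len();
    if n < 2 || factor <= 0.0 {
        return Vec::new();
    }
    // Mean distance between neighbouring points.
    let mean_gap = (sorted[n - 1] - sorted[0]) as f64 / (n - 1) as f64;
    let max_cluster_gap = mean_gap / factor; // denser than factor x mean density
    let min_void_gap = mean_gap * factor; // wider than factor x mean gap

    let mut spans = Vec::new();
    let mut run_start = 0;
    for i in 1..=n {
        // A run ends at the last point or when the next gap is too wide.
        let gap = if i < n {
            (sorted[i] - sorted[i - 1]) as f64
        } else {
            f64::INFINITY
        };
        if gap <= max_cluster_gap {
            continue;
        }
        let count = i - run_start;
        if count >= min_cluster_size {
            spans.push(Span::Attractor {
                start: sorted[run_start],
                end: sorted[i - 1],
                count,
            });
        }
        if i < n && gap >= min_void_gap {
            spans.push(Span::Void { start: sorted[i - 1], end: sorted[i] });
        }
        run_start = i;
    }
    spans
}
```

Raising `min_cluster_size` in this sketch discards short runs without affecting which gaps qualify as voids.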

Output

The tool outputs a JSON string that includes details about the identified attractors and voids, along with their respective z-scores. Here's an example of the JSON output format:

[
//...
{
"elements": [ 722, 722, 722, 725, 725, 726, 726, 726],
"start": 722,
"end": 726,
"span_length": 4,
"num_elements": 8,
"centroid": 724.0,
"z_score": 1.19528
},
{
"elements": [],
"start": 732,
"end": 740,
"span_length": 8,
"num_elements": 0,
"centroid": 736.0,
"z_score": -1.13359
},
//...
]
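The fields in the example can be mirrored by a plain Rust struct. In both sample entries, `span_length` equals `end - start` and `centroid` equals the midpoint of `start` and `end`. The sketch below assumes that these relations hold in general, which the README does not state. `SpanRecord` and its constructor are hypothetical names, and `i64` is an assumed integer width.

```rust
/// Mirror of one entry in the JSON output.
/// (Hypothetical type; field names are taken from the example output.)
#[derive(Debug)]
struct SpanRecord {
    elements: Vec<i64>,
    start: i64,
    end: i64,
    span_length: i64,
    num_elements: usize,
    centroid: f64,
    z_score: f64,
}

impl SpanRecord {
    /// Derives the dependent fields from `elements`, `start` and `end`.
    /// Assumption: span_length = end - start, centroid = midpoint,
    /// as observed in the two sample entries.
    fn new(elements: Vec<i64>, start: i64, end: i64, z_score: f64) -> Self {
        SpanRecord {
            span_length: end - start,
            num_elements: elements.len(),
            centroid: (start + end) as f64 / 2.0,
            elements,
            start,
            end,
            z_score,
        }
    }
}
```

A void is represented the same way, with an empty `elements` vector and `num_elements` equal to 0.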

From a File

To analyze a dataset from a file, provide the filename as an argument, followed by the factor and minimum cluster size parameters:

lyagushka random_values.txt 1.5 6

(= 'Attractor clusters need to have at least 6 numbers with 1.5 times the mean density, void gaps need to be at least 1.5 times the mean gap size wide')

From Stdin

Alternatively, you can pipe a list of integers into the tool, followed by the factor and minimum cluster size.

cat random_values.txt | lyagushka 0.5 2
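Both input modes boil down to reading newline-separated integers from either a file or stdin. A minimal sketch of that step, using only the Rust standard library, might look like the following. `read_input` and `parse_integers` are hypothetical helper names, the choice to skip blank lines is an assumption, and so is the `i64` type. How the tool itself handles malformed lines is not documented here.

```rust
use std::fs;
use std::io::{self, Read};
use std::num::ParseIntError;

/// Parses newline-separated integers, ignoring blank lines.
/// (Hypothetical helper; fails on the first non-integer line.)
fn parse_integers(text: &str) -> Result<Vec<i64>, ParseIntError> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| line.parse::<i64>())
        .collect()
}

/// Reads the whole input from a file when a path is given, else from stdin.
/// (Hypothetical helper mirroring the two usage modes.)
fn read_input(path: Option<&str>) -> io::Result<String> {
    match path {
        Some(p) => fs::read_to_string(p),
        None => {
            let mut buffer = String::new();
            io::stdin().read_to_string(&mut buffer)?;
            Ok(buffer)
        }
    }
}
```

Calling `read_input(Some("random_values.txt"))` corresponds to the file mode, and `read_input(None)` to the piped mode.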
