# Methodology

This document describes how the analysis was performed and, equally important, what was not done.

## Approach

The analysis used standard forensic techniques that any security researcher or system administrator would recognize. No proprietary code was reverse-engineered, no encryption was broken, and no software was decompiled.

Seven complementary methods were used, each revealing different aspects of the application's composition.

## What was done
### 1. String extraction from process memory

The core technique. When a Linux application runs, its loaded libraries, model metadata, and configuration data are present in process memory as readable strings. The `strings` command and standard text search tools extract these without interacting with the application's logic in any way.

This is the same technique used in malware analysis, software auditing, and license compliance verification across the industry. It reveals what libraries and models are loaded, but not how they are used or what proprietary code does with them.

Extracted strings were searched for known identifiers--library names, model checkpoint filenames, Python package docstrings, API signatures, and TensorRT layer names that correspond to published open-source projects. Each match was compared against the source code of the corresponding open-source project to confirm identity.

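The extraction-and-match step can be sketched in a few lines of Python. This is an illustrative sketch, not the tooling actually used (the analysis relied on the `strings` command and standard text search); the identifier list below is an example subset.

```python
import re

# Minimal sketch of what `strings` does: pull runs of at least
# `min_len` printable ASCII characters out of binary data.
def extract_strings(data: bytes, min_len: int = 8):
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]

# Identifiers to search for -- an illustrative subset, not the full list.
KNOWN_IDENTIFIERS = ("torch", "timm", "segmentation_models_pytorch")

def find_matches(strings):
    return sorted({ident for s in strings
                   for ident in KNOWN_IDENTIFIERS if ident in s})
```

Each hit would then be compared against the upstream project's source to confirm the match, as described above.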
### 2. TensorRT plugin analysis

TensorRT plugins are named components compiled for GPU inference. Their names appear in the binary and reveal which neural network operations are being used. Standard plugins (like convolution or batch normalization) are not informative, but custom plugins with distinctive names--like `DisentangledAttention_TRT` or `RnRes2FullFusion_TRT`--identify specific architectures.

Plugin names, along with quantization patterns (e.g., `int8_resnet50_stage_2_fusion`), indicate which backbone architectures have been compiled for production inference and at what precision.

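The separation of informative from uninformative plugin names can be sketched as a simple classifier. The "standard" list here is a made-up illustration; the plugin names come from the findings above.

```python
# Split TensorRT plugin names into distinctive custom plugins and
# standard operations. STANDARD_PLUGINS is illustrative, not exhaustive.
STANDARD_PLUGINS = {"Convolution", "BatchNormalization", "Relu"}

def classify_plugins(names):
    custom, standard = [], []
    for name in names:
        base = name.removesuffix("_TRT")
        (standard if base in STANDARD_PLUGINS else custom).append(name)
    return custom, standard
```

Only the custom bucket carries architectural signal; the standard bucket appears in virtually any compiled network.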
### 3. PyInstaller module listing

The `beeble-ai` binary is a PyInstaller-packaged Python application. PyInstaller bundles Python modules into an archive whose table of contents is readable without executing the application. This reveals which Python packages are bundled, including both open-source libraries and obfuscated proprietary modules.

The module listing identified 7,132 bundled Python modules, including the Pyarmor runtime used to encrypt Beeble's custom code. The obfuscated module structure (three main packages with randomized names, totaling approximately 82 modules) reveals the approximate scope of the proprietary code.

### 4. Electron app inspection

Beeble Studio's desktop UI is an Electron application. The compiled JavaScript code in the `dist/` directory is not obfuscated and reveals how the UI orchestrates the Python engine binary. This analysis examined:

- CLI flag construction (what arguments are passed to the engine)
- Database schema (what data is stored about jobs and outputs)
- Output directory structure (what files the engine produces)
- Progress reporting (what processing stages the engine reports)

This is the source of evidence about independent processing stages (`--run-alpha`, `--run-depth`, `--run-pbr`), the PBR frame stride parameter, and the output channel structure.

### 5. Library directory inventory

The application's `lib/` directory contains approximately 48 Python packages deployed alongside the main binary. These were inventoried to determine which packages are present, their version numbers, and whether license files are included. This is a straightforward directory listing--no files were extracted, modified, or executed.

The inventory revealed specific library versions (PyTorch 2.8.0, timm 1.0.15, OpenCV 4.11.0.86), confirmed which packages are deployed as separate directories versus compiled into the PyInstaller binary, and identified the license file gap (only 6 of 48 packages include their license files).

### 6. Engine setup log analysis

The application's setup process produces a detailed log file that records every file downloaded during installation. This log was read to understand the full scope of the deployment: total file count, total download size, and the complete list of downloaded components. The log is generated during normal operation and does not require any special access to read.

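Summarizing such a log is a one-pass scan. The line format below is hypothetical (the actual log format is not reproduced here); the pattern would be adapted to whatever the setup process writes:

```python
import re

# Hypothetical log line: "Downloaded <name> (<size> bytes)".
LINE_RE = re.compile(r"Downloaded (\S+) \((\d+) bytes\)")

def summarize_log(text):
    # Return (file count, total bytes) over all matching lines.
    sizes = [int(size) for _name, size in LINE_RE.findall(text)]
    return len(sizes), sum(sizes)
```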
### 7. Manifest and public claims review

The application's `manifest.json` file, downloaded during normal operation, was inspected for model references and metadata. Beeble's website, documentation, FAQ, and research pages were reviewed to understand how the technology is described to users. All public claims were archived with URLs and timestamps.

The manifest confirms Python 3.11 as the runtime (via the presence of `libpython3.11.so.1.0` in the downloaded files). TensorRT 10.12.0 was also identified, and notably, builder resources are present alongside the runtime--not just inference libraries. The presence of TensorRT builder components suggests possible on-device model compilation, meaning TensorRT engines may be compiled locally on the user's GPU rather than shipped as pre-built binaries.

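Pulling runtime clues out of such a manifest is plain JSON filtering. The `"files"` key below is an assumed schema for illustration, not Beeble's actual manifest structure:

```python
import json

def runtime_clues(manifest_text):
    # Keep entries that point at the Python runtime or TensorRT components.
    files = json.loads(manifest_text).get("files", [])
    return [f for f in files if "libpython" in f or "tensorrt" in f.lower()]
```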
## What was not done

This list defines the boundaries of the analysis and establishes that no proprietary technology was compromised.

- **No decompilation or disassembly.** The `beeble-ai` binary was never decompiled, disassembled, or analyzed at the instruction level. No tools like Ghidra, IDA Pro, or objdump were used to examine executable code.

- **No encryption was broken.** Beeble encrypts its model files with AES. Those encrypted files were not decrypted, and no attempt was made to recover encryption keys.

- **No Pyarmor circumvention.** The Pyarmor runtime that encrypts Beeble's custom Python code was not bypassed, attacked, or circumvented. The analysis relied on evidence visible outside the encrypted modules.

- **No code reverse-engineering.** The analysis did not examine how Beeble's proprietary code works, how models are orchestrated, or how SwitchLight processes its inputs. The only things identified were which third-party components are present and what architectural patterns they suggest.

- **No network interception.** No man-in-the-middle proxies or traffic analysis tools were used to intercept communications between the application and Beeble's servers.

- **No license circumvention.** The application was used under a valid license. No copy protection or DRM was circumvented.

## Limitations

This analysis can identify what components are present and draw reasonable inferences about how they are used, but it cannot see inside the encrypted code or the encrypted model files. Several important limitations follow:

**Architecture inference is indirect.** The conclusion that the PBR models use a segmentation_models_pytorch architecture is based on the co-presence of that framework, compatible backbones, and the absence of alternative architectural patterns. It is not based on direct observation of the model graph. Pyarmor encryption prevents reading the code that connects these components.

**TensorRT engines are opaque.** The compiled model engines inside the `.enc` files do not expose their internal layer structure to string extraction. The TRT plugins and quantization patterns found in the binary come from the TensorRT runtime environment, not from inside the encrypted model files.

**Single version analyzed.** The analysis was performed on one version of the Linux desktop application (engine version r1.3.0, model version m1.1.1). Other versions and platforms may differ.

**String extraction is inherently noisy.** Some identified strings may come from transient data, cached web content, or libraries loaded but not actively used in inference. The findings focus on strings that are unambiguous--complete docstrings, model checkpoint URLs, TensorRT plugin registrations, and package-specific identifiers that cannot plausibly appear by accident.