# Beeble Studio: Technical Analysis

**Date**: January 2026
**Subject**: Beeble Studio desktop application (Linux x86_64 RPM)
**Scope**: Identification of third-party components and architectural analysis of the application's AI pipeline

## 1. Introduction

Beeble Studio is a desktop application for VFX professionals that generates physically-based rendering (PBR) passes from video footage. It produces alpha mattes (background removal), depth maps, normal maps, base color, roughness, specular, and metallic passes, along with AI-driven relighting capabilities. Beeble markets its pipeline as "Powered by SwitchLight 3.0," their proprietary video-to-PBR model published at CVPR 2024. The application is sold as a subscription product, with plans starting at $42/month.

This analysis was prompted by observing that several of Beeble Studio's output passes closely resemble the outputs of well-known open-source models. Standard forensic techniques--string extraction from process memory, TensorRT plugin analysis, PyInstaller module listing, Electron app inspection, and manifest analysis--were used to determine which components the application actually contains and how they are organized.

## 2. Findings summary

The analysis identified four open-source models used directly for user-facing outputs, a complete open-source detection and tracking pipeline used for preprocessing, additional open-source architectural components, and a proprietary model whose architecture raises questions about how "proprietary" should be understood.
| Pipeline stage | Component | License | Open source |
|---------------|-----------|---------|-------------|
| Background removal (alpha) | transparent-background / InSPyReNet | MIT | Yes |
| Depth estimation | Depth Anything V2 via Kornia | Apache 2.0 | Yes |
| Person detection | RT-DETR via Kornia | Apache 2.0 | Yes |
| Face detection | Kornia face detection | Apache 2.0 | Yes |
| Multi-object tracking | BoxMOT via Kornia | MIT | Yes |
| Edge detection | DexiNed via Kornia | Apache 2.0 | Yes |
| Feature extraction | DINOv2 via timm | Apache 2.0 | Yes |
| Segmentation | segmentation_models_pytorch | MIT | Yes |
| Backbone architecture | PP-HGNet via timm | Apache 2.0 | Yes |
| Super resolution | RRDB-Net via Kornia | Apache 2.0 | Yes |
| PBR decomposition / relighting | SwitchLight 3.0 | Proprietary | See section 4 |

The preprocessing pipeline--background removal, depth estimation, feature extraction, segmentation--is composed entirely of open-source models used off the shelf.

The PBR decomposition and relighting stage is marketed as "SwitchLight 3.0." The CVPR 2024 paper describes it as a physics-based inverse rendering system with dedicated sub-networks (Normal Net, Specular Net) and a Cook-Torrance reflectance model. However, the application binary contains no references to any of this physics-based terminology, and the architectural evidence suggests the models are built from standard encoder-decoder segmentation frameworks with pretrained backbones from timm. This is discussed in detail in section 4.
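The string-extraction technique behind these findings can be sketched in a few lines of Python. This is an illustrative reconstruction, not the actual tooling used in the analysis; the marker list and the synthetic blob are examples.

```python
# Illustrative sketch of string extraction: count occurrences of known
# library identifiers in a raw binary or memory dump. The marker list and
# sample blob are examples, not the full set used in this analysis.
import re

MARKERS = [
    b"transparent_background",            # InSPyReNet wrapper package
    b"kornia.models.detection.rtdetr",    # RT-DETR via Kornia
    b"dinov2",                            # Meta's DINOv2
    b"segmentation_models_pytorch",       # encoder-decoder framework
    b"SwitchLight",                       # the marketing name under test
]

def count_markers(blob: bytes) -> dict[str, int]:
    """Map each marker to its number of occurrences in the blob."""
    return {m.decode(): len(re.findall(re.escape(m), blob)) for m in MARKERS}

# Tiny synthetic blob standing in for a real dump
# (in practice: Path("memdump.bin").read_bytes()):
blob = b"\x00kornia.models.detection.rtdetr\x00dinov2_vitb14_pretrain.pth\x00"
hits = count_markers(blob)
```

A scan of this shape over the real binaries is what yields results like the zero matches for `SwitchLight` reported in section 4.2.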
The reconstructed pipeline architecture:

```
Input Video Frame
  |
  +--[RT-DETR + PP-HGNet]-----------> Person Detection
  |        |
  |        +--[BoxMOT]---------------> Tracking (multi-frame)
  |
  +--[Face Detection]---------------> Face Regions
  |
  +--[InSPyReNet]-------------------> Alpha Matte
  |
  +--[Depth Anything V2]------------> Depth Map
  |
  +--[DINOv2]-------> Feature Maps
  |        |
  |        +--[segmentation_models_pytorch]---> Segmentation
  |
  +--[DexiNed]----------------------> Edge Maps
  |
  +--[SMP encoder-decoder + PP-HGNet/ResNet backbone]
  |        |
  |        +----> Normal Map
  |        +----> Base Color
  |        +----> Roughness
  |        +----> Specular
  |        +----> Metallic
  |
  +--[RRDB-Net]---------------------> Super Resolution
  |
  +--[Relighting model]-------------> Relit Output
```

Each stage runs independently. The Electron app passes separate CLI flags (`--run-alpha`, `--run-depth`, `--run-pbr`) to the engine binary, and each flag can be used in isolation. This is not a unified end-to-end model--it is a pipeline of independent models. The detection and tracking stages (RT-DETR, BoxMOT, face detection) serve as preprocessing--locating and tracking subjects across frames before the extraction models run.

## 3. Evidence for each component

### 3.1 Background removal: transparent-background / InSPyReNet

The complete API docstring for the `transparent-background` Python package was found verbatim in process memory:

```
Args:
    img (PIL.Image or np.ndarray): input image
    type (str): output type option as below.
        'rgba' will generate RGBA output regarding saliency score as an alpha map.
        'green' will change the background with green screen.
        'white' will change the background with white color.
        '[255, 0, 0]' will change the background with color code [255, 0, 0].
        'blur' will blur the background.
        'overlay' will cover the salient object with translucent green color, and highlight the edges.
Returns:
    PIL.Image: output image
```

This is a character-for-character match with the docstring published at https://github.com/plemeri/transparent-background.
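The "character-for-character match" claim amounts to a whitespace-insensitive containment check, which can be sketched as below. The dump text here is an abbreviated, illustrative stand-in for real string-extractor output.

```python
# Sketch: verify that a published docstring fragment appears verbatim in
# extracted strings, ignoring only line-wrapping differences. The dump text
# below is an abbreviated stand-in for real extractor output.
def contains_verbatim(dump_text: str, fragment: str) -> bool:
    """True if `fragment` occurs in `dump_text` after collapsing whitespace."""
    collapse = lambda s: " ".join(s.split())
    return collapse(fragment) in collapse(dump_text)

PUBLISHED = ("'rgba' will generate RGBA output "
             "regarding saliency score as an alpha map.")

dump = """type (str): output type option as below.
    'rgba' will generate RGBA output
    regarding saliency score as an alpha map."""

match = contains_verbatim(dump, PUBLISHED)  # True for this sample dump
```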
Additionally, TensorRT layer names found in the binary correspond to Res2Net bottleneck blocks (`RnRes2Br1Br2c_TRT`, `RnRes2Br2bBr2c_TRT`, `RnRes2FullFusion_TRT`); Res2Net is the backbone architecture used by InSPyReNet. The `transparent_background.backbones.SwinTransformer` module path was also found in the PyInstaller bundle's module list.

- **Library**: transparent-background (`pip install transparent-background`)
- **Model**: InSPyReNet (Kim et al., ACCV 2022)
- **License**: MIT
- **Paper**: https://arxiv.org/abs/2209.09475
- **Repository**: https://github.com/plemeri/transparent-background

### 3.2 Depth estimation: Depth Anything V2

The complete API documentation for Depth Anything V2's ONNX export interface was found in process memory:

```
Export a DepthAnything model to an ONNX model file.

Args:
    model_name: The name of the model to be loaded. Valid model names include:
        - `depth-anything-v2-small`
        - `depth-anything-v2-base`
        - `depth-anything-v2-large`
    model_type: The type of the model to be loaded. Valid model types include:
        - `model`
        - `model_bnb4`
        - `model_fp16`
        - `model_int8`
```

This is accessed through Kornia's ONNX builder interface (`kornia.onnx.DepthAnythingONNXBuilder`), with 50+ additional references to Kornia's tutorials and modules throughout the binary.
- **Library**: Kornia (`pip install kornia`)
- **Model**: Depth Anything V2 (Yang et al., 2024)
- **License**: Apache 2.0
- **Paper**: https://arxiv.org/abs/2406.09414
- **Repository**: https://github.com/kornia/kornia

### 3.3 Feature extraction: DINOv2

Multiple references to DINOv2 were found across the application:

- Runtime warning: `WARNING:dinov2:xFormers not available` (captured from application output during normal operation)
- Model checkpoint URLs: `dinov2_vits14_pretrain.pth`, `dinov2_vitb14_pretrain.pth` (Meta's public model hosting)
- timm model registry name: `vit_large_patch14_dinov2.lvd142m`
- File path: `/mnt/work/Beeble_Models/lib/timm/models/hrnet.py` (timm library bundled in the application)

DINOv2 is Meta's self-supervised vision transformer. It does not produce a user-facing output directly--it generates feature maps that feed into downstream models. This is a standard pattern in modern computer vision: use a large pretrained backbone for feature extraction, then train smaller task-specific heads on top.

- **Library**: timm (`pip install timm`)
- **Model**: DINOv2 (Oquab et al., Meta AI, 2023)
- **License**: Apache 2.0
- **Paper**: https://arxiv.org/abs/2304.07193
- **Repository**: https://github.com/huggingface/pytorch-image-models

### 3.4 Segmentation: segmentation_models_pytorch

A direct reference to the library's GitHub repository, along with encoder/decoder architecture parameters and decoder documentation, was found in process memory:

```
encoder_name: Name of the encoder to use.
encoder_depth: Depth of the encoder.
decoder_channels: Number of channels in the decoder.
decoder_name: What decoder to use.
    https://github.com/qubvel-org/segmentation_models.pytorch/
    tree/main/segmentation_models_pytorch/decoders
Note: Only encoder weights are available. Pretrained weights
for the whole model are not available.
```

This library is a framework for building encoder-decoder segmentation models.
It is not a model itself--it provides the architecture (UNet, FPN, DeepLabV3, etc.) into which you plug a pretrained encoder backbone (ResNet, EfficientNet, etc.) and train the decoder on your own data for your specific task. Its presence alongside the pretrained backbones described below suggests it serves as the architectural foundation for one or more of the PBR output models. This is discussed further in section 4.

- **Library**: segmentation_models_pytorch (`pip install segmentation-models-pytorch`)
- **License**: MIT
- **Repository**: https://github.com/qubvel-org/segmentation_models.pytorch

### 3.5 Backbone: PP-HGNet

The `HighPerfGpuNet` class was found in process memory along with its full structure:

```
HighPerfGpuNet
HighPerfGpuNet.forward_features
HighPerfGpuNet.reset_classifier
HighPerfGpuBlock.__init__
LearnableAffineBlock
ConvBNAct.__init__
StemV1.__init__
StemV1.forward
_create_hgnetr
```

This is PP-HGNet (PaddlePaddle High Performance GPU Network), ported to timm's model registry. Documentation strings confirm the identity:

```
PP-HGNet (V1 & V2)
PP-HGNetv2: https://github.com/PaddlePaddle/PaddleClas/
    .../pp_hgnet_v2.py
```

PP-HGNet is a convolutional backbone architecture designed for fast GPU inference, originally developed for Baidu's RT-DETR real-time object detection system. It is available as a pretrained backbone through timm and is commonly used as an encoder in larger models.

PP-HGNet serves a dual role in the Beeble pipeline. First, it functions as the backbone encoder for the RT-DETR person detection model (see section 3.6). Second, based on the co-presence of `segmentation_models_pytorch` and compatible encoder interfaces, it likely serves as one of the backbone encoders for the PBR decomposition models. This dual use is standard--the same pretrained backbone can be loaded into different model architectures for different tasks.
- **Library**: timm (`pip install timm`)
- **Model**: PP-HGNet (Baidu/PaddlePaddle)
- **License**: Apache 2.0
- **Repository**: https://github.com/huggingface/pytorch-image-models

### 3.6 Detection and tracking pipeline

The binary contains a complete person detection and tracking pipeline built from open-source models accessed through Kornia.

**RT-DETR (Real-Time Detection Transformer).** Full module paths for RT-DETR were found in the binary:

```
kornia.contrib.models.rt_detr.architecture.hgnetv2
kornia.contrib.models.rt_detr.architecture.resnet_d
kornia.contrib.models.rt_detr.architecture.rtdetr_head
kornia.contrib.models.rt_detr.architecture.hybrid_encoder
kornia.models.detection.rtdetr
```

RT-DETR model configuration strings confirm the PP-HGNet connection:

```
Configuration to construct RT-DETR model.
- HGNetV2-L: 'hgnetv2_l' or RTDETRModelType.hgnetv2_l
- HGNetV2-X: 'hgnetv2_x' or RTDETRModelType.hgnetv2_x
```

RT-DETR is Baidu's real-time object detection model, published at ICLR 2024. It detects and localizes objects (including persons) in images. In Beeble's pipeline, it likely serves as the initial stage that identifies which regions of the frame contain subjects to process.

- **Model**: RT-DETR (Zhao et al., 2024)
- **License**: Apache 2.0 (via Kornia)
- **Paper**: https://arxiv.org/abs/2304.08069

**Face detection.** The `kornia.contrib.face_detection` module and `kornia.contrib.FaceDetectorResult` class were found in the binary. This provides face region detection, likely used to guide the PBR models in handling facial features (skin, eyes, hair) differently from other body parts or clothing.

**BoxMOT (multi-object tracking).** The module path `kornia.models.tracking.boxmot_tracker` was found in the binary. BoxMOT is a multi-object tracking library that maintains identity across video frames--given detections from RT-DETR on each frame, BoxMOT tracks which detection corresponds to which person over time.
- **Repository**: https://github.com/mikel-brostrom/boxmot
- **License**: MIT (AGPL-3.0 for some trackers)

The presence of a full detection-tracking pipeline is notable because it means the video processing is not a single model operating on raw frames. The pipeline first detects and tracks persons, then runs the extraction models on the detected regions. This is a standard computer vision approach, and every component in this preprocessing chain is open-source.

### 3.7 Edge detection and super resolution

Two additional open-source models were found:

**DexiNed (edge detection).** The module path `kornia.models.edge_detection.dexined` was found in the binary. DexiNed (Dense Extreme Inception Network for Edge Detection) is a CNN-based edge detector. It likely produces edge maps used as auxiliary input or guidance for other models in the pipeline.

- **Model**: DexiNed (Soria et al., 2020)
- **License**: Apache 2.0 (via Kornia)

**RRDB-Net (super resolution).** The module path `kornia.models.super_resolution.rrdbnet` was found in the binary. RRDB-Net (Residual-in-Residual Dense Block Network) is the backbone of ESRGAN, the widely used super resolution model. This is likely used to upscale PBR passes to the output resolution.

- **Model**: RRDB-Net / ESRGAN (Wang et al., 2018)
- **License**: Apache 2.0 (via Kornia)

### 3.8 TensorRT plugins and quantized backbones

Several custom TensorRT plugins were found compiled for inference:

- `DisentangledAttention_TRT` -- a custom TRT plugin implementing DeBERTa-style disentangled attention (He et al., Microsoft, 2021). The `_TRT` suffix indicates this is compiled for production inference, not just a bundled library. This suggests a transformer component in the pipeline that uses disentangled attention to process content and position information separately.
- `GridAnchorRect_TRT` -- anchor generation for object detection.
Combined with the RT-DETR and face detection references, this confirms that the pipeline includes a detection stage.

Multiple backbone architectures were found with TensorRT INT8 quantization and stage-level fusion optimizations:

```
int8_resnet50_stage_1_4_fusion
int8_resnet50_stage_2_fusion
int8_resnet50_stage_3_fusion
int8_resnet34_stage_1_4_fusion
int8_resnet34_stage_2_fusion
int8_resnet34_stage_3_fusion
int8_resnext101_backbone_fusion
```

This shows that ResNet-34, ResNet-50, and ResNeXt-101 are compiled for INT8 inference with stage-level fusion. These are standard pretrained backbones available from torchvision and timm.

### 3.9 Additional libraries

The binary contains references to supporting libraries that are standard in ML applications:

| Library | License | Role |
|---------|---------|------|
| PyTorch 2.8.0+cu128 | BSD 3-Clause | Core ML framework |
| TensorRT 10 | NVIDIA proprietary | Model compilation and inference |
| OpenCV 4.11.0.86 (with Qt5, FFmpeg) | Apache 2.0 | Image processing |
| timm 1.0.15 | Apache 2.0 | Model registry and backbones |
| Albumentations | MIT | Image augmentation |
| Pillow | MIT-CMU | Image I/O |
| HuggingFace Hub | Apache 2.0 | Model downloading |
| gdown | MIT | Google Drive file downloading |
| NumPy, SciPy | BSD | Numerical computation |
| Hydra / OmegaConf | MIT | ML configuration management |
| einops | MIT | Tensor manipulation |
| safetensors | Apache 2.0 | Model weight format |
| Flet | Apache 2.0 | Cross-platform GUI framework |
| SoftHSM2 / PKCS#11 | BSD 2-Clause | License token validation |
| OpenSSL 1.1 | Apache 2.0 | Cryptographic operations |

Two components deserve further mention. **Pyarmor** (runtime ID `pyarmor_runtime_007423`) is used to encrypt all of Beeble's custom Python code--every proprietary module is obfuscated with randomized names and encrypted bytecode. This prevents static analysis of how models are orchestrated.
**Flet** is the GUI framework powering the Python-side interface.

## 4. Architecture analysis

This section presents evidence about how the PBR decomposition model is constructed. The findings here are more inferential than those in section 3--they are based on the absence of expected evidence and the presence of architectural patterns, rather than on verbatim string matches. The distinction matters, and we draw it clearly.

### 4.1 What the CVPR 2024 paper describes

The SwitchLight CVPR 2024 paper describes a physics-based inverse rendering architecture with several dedicated components:

- A **Normal Net** that estimates surface normals
- A **Specular Net** that predicts specular reflectance properties
- Analytical **albedo derivation** using a Cook-Torrance BRDF model
- A **Render Net** that performs the final relighting
- Spherical harmonics for environment lighting representation

This is presented as a unified system where intrinsic decomposition (breaking an image into its physical components) is an intermediate step in the relighting pipeline. The paper's novelty claim rests partly on this physics-driven architecture.

### 4.2 What the binary contains

A thorough string search of the 2 GB process memory dump and the 56 MB application binary found **zero** matches for the following terms:

- `cook_torrance`, `cook-torrance`, `Cook_Torrance`, `CookTorrance`
- `brdf`, `BRDF`
- `albedo`
- `specular_net`, `normal_net`, `render_net`
- `lightstage`, `light_stage`, `OLAT`
- `environment_map`, `env_map`, `spherical_harmonic`, `SH_coeff`
- `inverse_rendering`, `intrinsic_decomposition`
- `relight` (as a function or class name)
- `switchlight`, `SwitchLight` (in any capitalization)

Not one of these terms appears anywhere in the application.

The absence of "SwitchLight" deserves emphasis. The term was searched across three independent codebases:

1. The `beeble-ai` engine binary (56 MB) -- zero matches
2. The `beeble-engine-setup` binary (13 MB) -- zero matches
3. All 667 JavaScript files in the Electron app's `dist/` directory -- zero matches

"SwitchLight" is purely a marketing name. It does not appear as a model name, a class name, a configuration key, a log message, or a comment anywhere in the application. By contrast, open-source component names appear throughout the binary because they are real software identifiers used by real code. "SwitchLight" is not used by any code at all.

This is a significant absence. When an application uses a library or implements an algorithm, its terminology appears in memory through function names, variable names, error messages, logging, docstrings, or class definitions. The open-source components (InSPyReNet, Depth Anything, DINOv2, RT-DETR, BoxMOT) are all identifiable precisely because their terminology is present. The physics-based rendering vocabulary described in the CVPR paper is entirely absent.

There is a caveat: Beeble encrypts its custom Python code with Pyarmor, which encrypts bytecode and obfuscates module names. If the Cook-Torrance logic exists only in Pyarmor-encrypted modules, its terminology would not be visible to string extraction. However, TensorRT layer names, model checkpoint references, and library-level strings survive Pyarmor encryption--and none of those contain physics-based rendering terminology either.

### 4.3 What the binary contains instead

Where you would expect physics-based rendering components, the binary shows standard machine learning infrastructure:

- **segmentation_models_pytorch** -- an encoder-decoder segmentation framework designed for dense pixel prediction tasks. It provides architectures (UNet, FPN, DeepLabV3) that take pretrained encoder backbones and learn to predict pixel-level outputs.
- **PP-HGNet, ResNet-34, ResNet-50, ResNeXt-101** -- standard pretrained backbone architectures, all available from timm. These are the encoders that plug into segmentation_models_pytorch.
- **DINOv2** -- a self-supervised feature extractor that provides rich visual features as input to downstream models.
- **DisentangledAttention** -- a transformer attention mechanism, compiled as a custom TRT plugin for inference.

This is the standard toolkit for building dense prediction models in computer vision. You pick an encoder backbone, connect it to a segmentation decoder, and train the resulting model to predict whatever pixel-level output you need--whether that is semantic labels, depth values, or normal vectors.

### 4.4 What the Electron app reveals

The application's Electron shell (the UI layer that orchestrates the Python engine) is not encrypted and provides clear evidence about the pipeline structure. The engine binary receives independent processing flags:

- `--run-alpha` -- generates alpha mattes
- `--run-depth` -- generates depth maps
- `--run-pbr` -- generates BaseColor, Normal, Roughness, Specular, Metallic

Each flag can be used in isolation. A user can request alpha without depth, or depth without PBR. The Electron app constructs these flags independently based on user selections.

A session-start log entry captured in process memory confirms this separation:

```json
{
  "extra_command": "--run-pbr --run-alpha --run-depth --save-exr --pbr-stride 1,2 --fps 24.0 --engine-version r1.3.0-m1.1.1"
}
```

The `--pbr-stride 1,2` flag is notable. It indicates that PBR passes are not computed on every frame--the engine processes a strided subset of frames and presumably interpolates the rest. This contradicts the "true end-to-end video model that understands motion natively" claim on Beeble's research page. A model that truly processes video end-to-end would not need to skip frames.
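The per-pass flag structure described above can be mirrored in a few lines. The helper below is hypothetical--a sketch of how any orchestration layer might assemble such flags--but the flag names themselves (`--run-alpha`, `--run-depth`, `--run-pbr`, `--pbr-stride`, `--fps`) come from the captured session log.

```python
# Hypothetical sketch of assembling the engine flags observed in the log.
# Flag names are taken from the captured session entry; the function and its
# defaults are illustrative, not recovered code.
def build_engine_args(alpha: bool = False, depth: bool = False,
                      pbr: bool = False, pbr_stride: str = "1,2",
                      fps: float = 24.0) -> list[str]:
    """Build the argument list for one engine invocation."""
    args: list[str] = []
    if pbr:
        args += ["--run-pbr", "--pbr-stride", pbr_stride]
    if alpha:
        args.append("--run-alpha")
    if depth:
        args.append("--run-depth")
    return args + ["--fps", str(fps)]

# Each pass can be requested in isolation, matching the observed behavior:
alpha_only = build_engine_args(alpha=True)
```

The point of the sketch is the independence: nothing forces the passes to run together, which is what the captured logs show.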
### 4.5 What this suggests

The evidence points to a specific conclusion: the PBR decomposition model is most likely a standard encoder-decoder segmentation model (segmentation_models_pytorch architecture) with pretrained backbones (PP-HGNet, ResNet, DINOv2), trained on Beeble's private dataset to predict PBR channels as its output.

This is a common and well-understood approach in computer vision. You take a pretrained backbone, attach a decoder, and train the whole model on your task-specific data using task-specific losses. The Cook-Torrance reflectance model described in the CVPR paper would then be a *training-time loss function*--used to compute the error between predicted and ground-truth renders during training--rather than an architectural component that exists at inference time.

This distinction matters because it changes what "Powered by SwitchLight 3.0" actually means. The CVPR paper's framing suggests a novel physics-driven architecture. The binary evidence suggests standard open-source architectures trained with proprietary data. The genuine proprietary elements are the training methodology, the lightstage training data, and the trained weights--not the model architecture itself.

We want to be clear about the limits of this inference. The Pyarmor encryption prevents us from seeing the actual pipeline code, and the TensorRT engines inside the encrypted `.enc` model files do not expose their internal layer structure through string extraction. It is possible, though we think unlikely, that the physics-based rendering code exists entirely within the encrypted layers and uses no standard terminology. We present this analysis as our best reading of the available evidence, not as a certainty.

## 5. Code protection

Beeble uses two layers of protection to obscure its pipeline:

**Model encryption.** The six model files are stored as `.enc` files encrypted with AES.
They total 4.4 GB:

| File | Size |
|------|------|
| 97b0085560.enc | 1,877 MB |
| b001322340.enc | 1,877 MB |
| 6edccd5753.enc | 351 MB |
| e710b0c669.enc | 135 MB |
| 0d407dcf32.enc | 111 MB |
| 7f121ea5bc.enc | 49 MB |

The filenames are derived from their SHA-256 hashes. No metadata in the manifest indicates what each model does. However, comparing file sizes against known open-source model checkpoints is suggestive:

- The 351 MB file closely matches the size of a DINOv2 ViT-B checkpoint (~346 MB for `dinov2_vitb14_pretrain.pth`)
- The two ~1,877 MB files are nearly identical in size (within 1 MB of each other), suggesting two variants of the same model compiled to TensorRT engines--possibly different precision levels or input resolution configurations
- The smaller files (49 MB, 111 MB, 135 MB) are consistent with single-task encoder-decoder models compiled to TensorRT with INT8 quantization

**Code obfuscation.** All custom Python code is encrypted with Pyarmor. Module names are randomized (`q47ne3pa`, `qf1hf17m`, `vk3zuv58`) and bytecode is decrypted only at runtime. The application contains approximately 82 obfuscated modules across three main packages, with the largest single module being 108 KB.

This level of protection is unusual for a desktop application in the VFX space, and it is worth understanding what it does and does not hide. Pyarmor prevents reading the pipeline orchestration code--how models are loaded, connected, and run. But it does not hide which libraries are loaded into memory, which TensorRT plugins are compiled, or what command-line interface the engine exposes. Those are the evidence sources this analysis relies on.

## 6. Beeble's public claims

Beeble's marketing consistently attributes the entire Video-to-VFX pipeline to SwitchLight. The following are exact quotes from their public pages (see [evidence/marketing_claims.md](../evidence/marketing_claims.md) for the complete archive).
**Beeble Studio product page** (beeble.ai/beeble-studio):

> Powered by **SwitchLight 3.0**, convert images and videos into
> **full PBR passes with alpha and depth maps** for seamless
> relighting, background removal, and advanced compositing.

**SwitchLight 3.0 research page** (beeble.ai/research/switchlight-3-0-is-here):

> SwitchLight 3.0 is the best Video-to-PBR model in the world.
> SwitchLight 3.0 is a **true end-to-end video model** that
> understands motion natively.

**Documentation FAQ** (docs.beeble.ai/help/faq):

On the "What is Video-to-VFX?" question:

> **Video-to-VFX** uses our foundation model, **SwitchLight 3.0**,
> and SOTA AI models to convert your footage into VFX-ready assets.

On the "Is Beeble's AI trained responsibly?" question:

> When open-source models are included, we choose them
> carefully--only those with published research papers that disclose
> their training data and carry valid commercial-use licenses.

The FAQ is the only public place where Beeble acknowledges the use of open-source models. The product page and research page present the entire pipeline as "Powered by SwitchLight 3.0" without distinguishing which output passes come from SwitchLight versus third-party open-source models.

### Investor-facing claims

Beeble raised a $4.75M seed round in July 2024 at a reported $25M valuation, led by Basis Set Ventures and Fika Ventures. At the time, the company had approximately 7 employees. Press coverage of the funding consistently uses language like "foundational model" and "world-class foundational model in lighting" to describe SwitchLight--language that implies a novel, proprietary system rather than a pipeline of open-source components with proprietary weights.

These investor-facing claims were made through public press releases and coverage, not private communications. They are relevant because they represent how Beeble chose to characterize its technology to the market.
See [evidence/marketing_claims.md](../evidence/marketing_claims.md) for archived quotes.

The "true end-to-end video model" claim is particularly difficult to reconcile with the evidence. The application processes alpha, depth, and PBR as independent stages using separate CLI flags. PBR processing uses a frame stride (`--pbr-stride 1,2`), skipping frames rather than processing video natively. This is a pipeline of separate models, not an end-to-end video model.

## 7. What Beeble does well

This analysis would be incomplete without acknowledging what is genuinely Beeble's own work.

**SwitchLight is published research.** The CVPR 2024 paper describes a real methodology for training intrinsic decomposition models using lightstage data and physics-based losses. Whether the deployed architecture matches the paper's description is a separate question from whether the research itself has merit. It does.

**The trained weights are real work.** If the PBR model is built on standard architectures (as the evidence suggests), the value lies in the training data and training process. Acquiring lightstage data, designing loss functions, and iterating on model quality is substantial work. Pretrained model weights trained on high-quality domain-specific data are genuinely valuable, even when the architecture is standard.

**TensorRT compilation is non-trivial engineering.** Converting PyTorch models to TensorRT engines with INT8 quantization for real-time inference requires expertise. The application runs at interactive speeds on consumer GPUs with 11 GB+ VRAM.

**The product is a real product.** The desktop application, Nuke/Blender/Unreal integrations, cloud API, render queue, EXR output with ACEScg color space support, and overall UX represent substantial product engineering.

## 8. The real question

Most Beeble Studio users use the application for PBR extractions: alpha mattes, diffuse/albedo, normals, and depth maps.
The relighting features exist but are secondary to the extraction workflow for much of the user base.

The alpha and depth extractions are produced by open-source models used off the shelf. They can be replicated for free using the exact same libraries. The PBR extractions (normal, base color, roughness, specular, metallic) use models whose trained weights are proprietary, but whose architecture appears to be built from the same open-source frameworks available to anyone. Open-source alternatives for PBR decomposition now exist (CHORD from Ubisoft, RGB-X from Adobe) and are narrowing the quality gap, though they were trained on different data and may perform differently on portrait subjects. See [COMFYUI_GUIDE.md](COMFYUI_GUIDE.md) for a detailed guide on replicating each stage of the pipeline with open-source tools.

There is a common assumption that the training data represents a significant barrier to replication--that lightstage captures are expensive and rare, and therefore the trained weights are uniquely valuable. This may overstate the difficulty. For PBR decomposition training, what you need is a dataset of images paired with ground-truth PBR maps (albedo, normal, roughness, metallic). Modern 3D character pipelines--Unreal Engine MetaHumans, Blender character generators, procedural systems in Houdini--can render hundreds of thousands of such pairs with varied poses, lighting, skin tones, and clothing. The ground truth is inherent: you created the scene, so you already have the PBR maps. With model sizes under 2 GB and standard encoder-decoder architectures, the compute cost to train equivalent models from synthetic data is modest.

None of this means Beeble has no value. Convenience, polish, and integration are real things people pay for.
But the gap between what the marketing says ("Powered by SwitchLight 3.0") and what the application contains (a pipeline of mostly open-source components, some used directly and others used as architectural building blocks) is wider than what users would reasonably expect. And the technical moat may be thinner than investors were led to believe.

## 9. License compliance

All identified open-source components require attribution in redistributed software. Both the MIT License and the Apache 2.0 License require that copyright notices and license texts be included with any distribution of the software. No such attribution was found in Beeble Studio's application, documentation, or user-facing materials.

The scope of the issue extends beyond the core models. The application bundles approximately 48 Python packages in its `lib/` directory. Of these, only 6 include LICENSE files (cryptography, gdown, MarkupSafe, numpy, openexr, triton). The remaining 42 packages--including PyTorch, Kornia, Pillow, and others with attribution requirements--have no license files in the distribution.

For a detailed analysis of each license's requirements and what compliance would look like, see [LICENSE_ANALYSIS.md](LICENSE_ANALYSIS.md).

## 10. Conclusion

Beeble Studio's Video-to-VFX pipeline is a collection of independent models, most built from open-source components. The preprocessing stages are entirely open-source: background removal (InSPyReNet), depth estimation (Depth Anything V2), person detection (RT-DETR with PP-HGNet), face detection (Kornia), multi-object tracking (BoxMOT), edge detection (DexiNed), and super resolution (RRDB-Net). The PBR decomposition models appear to be built on open-source architectural frameworks (segmentation_models_pytorch, timm backbones) with proprietary trained weights.

The name "SwitchLight" does not appear anywhere in the application--not in the engine binary, not in the setup binary, not in the Electron app's 667 JavaScript files.
It is a marketing name that refers to no identifiable software component.

The CVPR 2024 paper describes a physics-based inverse rendering architecture. The deployed application contains no evidence of physics-based rendering code at inference time. The most likely explanation is that the physics (Cook-Torrance rendering) was used during training as a loss function, and the deployed model is a standard feedforward network that learned to predict PBR channels from that training process.

Beeble's marketing attributes the entire pipeline to SwitchLight 3.0. The evidence shows that alpha mattes come from InSPyReNet, depth maps come from Depth Anything V2, person detection comes from RT-DETR, tracking comes from BoxMOT, and the PBR models are built on segmentation_models_pytorch with PP-HGNet and ResNet backbones. The "true end-to-end video model" claim is contradicted by the independent processing flags and the frame stride parameter observed in the application.

Of the approximately 48 Python packages bundled with the application, only 6 include license files. The core open-source models' licenses require attribution that does not appear to be provided.

These findings can be independently verified using the methods described in [VERIFICATION_GUIDE.md](VERIFICATION_GUIDE.md).
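The bundled-license audit behind the section 9 figures is also easy to reproduce. The sketch below assumes a `lib/` layout with one subdirectory per package (a common PyInstaller-style arrangement); the path and license filenames are illustrative assumptions, not recovered configuration.

```python
# Sketch: audit which bundled packages ship a license file. The one-directory-
# per-package layout is an assumption about the bundle; adjust as needed.
from pathlib import Path

LICENSE_NAMES = {"LICENSE", "LICENSE.txt", "LICENSE.md", "COPYING"}

def audit_licenses(lib_dir):
    """Map each package directory name to whether it contains a license file."""
    result = {}
    for pkg in sorted(Path(lib_dir).iterdir()):
        if pkg.is_dir():
            result[pkg.name] = any((pkg / name).is_file() for name in LICENSE_NAMES)
    return result

# Usage (hypothetical install path):
# report = audit_licenses("/opt/beeble/lib")
# missing = [name for name, ok in report.items() if not ok]
```

A count of `missing` over the real bundle is the kind of check that produced the 6-of-48 figure reported above.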