This project implements a Mini Virtual Lab that transforms vague, qualitative R&D requirements into structured, explainable, and comparable decision problems. Rather than designing materials or molecules directly, the focus is on making early-stage R&D decisions tractable, transparent, and human-centered. The framework is intentionally domain-agnostic and can be applied across materials science, pharmaceutical R&D, and other research-driven domains. At its core, the project treats problem formulation itself as a first-class, machine-readable research object.
This case study demonstrates how the framework transforms a vague R&D requirement into a structured, explainable decision process.
> "We need a promising R&D candidate that achieves good performance, but minimizing failure risk is critical in early-stage development."
```json
{
  "objectives": [
    {
      "name": "Overall technical performance",
      "metric_key": "performance",
      "direction": "maximize",
      "weight": 0.6
    },
    {
      "name": "Risk of experimental failure",
      "metric_key": "failure_risk",
      "direction": "minimize",
      "weight": 0.4
    }
  ],
  "constraints": [],
  "recommended_stage": "early_exploration"
}
```
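A definition in this shape can be loaded into typed objects before any evaluation runs. Below is a minimal sketch using Python's stdlib dataclasses; the `Objective` and `ProblemDefinition` class names and the weight-sum check are illustrative assumptions, not necessarily the project's actual implementation:

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Objective:
    name: str        # human-readable intent
    metric_key: str  # machine-evaluable signal
    direction: str   # "maximize" or "minimize"
    weight: float

@dataclass(frozen=True)
class ProblemDefinition:
    objectives: list
    constraints: list
    recommended_stage: str

def parse_problem(raw: str) -> ProblemDefinition:
    """Parse an LLM-generated problem definition and fail fast on bad structure."""
    data = json.loads(raw)
    objectives = [Objective(**entry) for entry in data["objectives"]]
    for obj in objectives:
        if obj.direction not in ("maximize", "minimize"):
            raise ValueError(f"invalid direction: {obj.direction!r}")
    if abs(sum(obj.weight for obj in objectives) - 1.0) > 1e-9:
        raise ValueError("objective weights must sum to 1")
    return ProblemDefinition(objectives, data.get("constraints", []),
                             data["recommended_stage"])
```

Failing fast at parse time is what keeps malformed LLM output from silently propagating into evaluation or visualization.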
- Minimizing failure risk carries substantial weight (0.4), reflecting the uncertainty and cost sensitivity of early-stage R&D.
- The problem formulation is optimization-ready and directly evaluable.
All candidates share the same measurable metrics, satisfying the Validated Decision boundary.
```json
[
  {"name": "Candidate A", "metrics": {"performance": 0.82, "failure_risk": 0.32}},
  {"name": "Candidate B", "metrics": {"performance": 0.78, "failure_risk": 0.28}},
  {"name": "Candidate C", "metrics": {"performance": 0.90, "failure_risk": 0.45}},
  {"name": "Candidate D", "metrics": {"performance": 0.74, "failure_risk": 0.30}},
  {"name": "Candidate E", "metrics": {"performance": 0.80, "failure_risk": 0.18}},
  {"name": "Candidate F", "metrics": {"performance": 0.86, "failure_risk": 0.25}}
]
```
```
Candidate E: score=0.140
Candidate F: score=0.037
Candidate B: score=-0.034
Candidate D: score=-0.065
Candidate A: score=-0.103
Candidate C: score=-0.151
```
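The exact normalization behind these scores is not stated. As an illustration only, a direction-aware weighted sum over mean-centered metrics can produce a ranking of this kind; since the normalization here is an assumption, absolute values and close orderings may differ from the output above:

```python
from statistics import mean

def rank_candidates(candidates, objectives):
    """Rank candidates by a direction-aware weighted sum of mean-centered metrics.

    `objectives` entries carry "metric_key", "direction", and "weight", as in
    the problem definition. Mean-centering is an assumed normalization, not
    necessarily the scheme used to produce the published scores.
    """
    centers = {o["metric_key"]: mean(c["metrics"][o["metric_key"]] for c in candidates)
               for o in objectives}
    def score(candidate):
        total = 0.0
        for o in objectives:
            sign = 1.0 if o["direction"] == "maximize" else -1.0  # minimize flips the sign
            deviation = candidate["metrics"][o["metric_key"]] - centers[o["metric_key"]]
            total += sign * o["weight"] * deviation
        return total
    return sorted(((c["name"], round(score(c), 3)) for c in candidates),
                  key=lambda pair: pair[1], reverse=True)
```

Whatever the normalization, the key property is that a candidate dominating another on every objective always ranks higher.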
Candidate E is preferred because it best minimizes failure risk, the most safety-critical objective in this decision, while still offering reasonable performance. The ranking places strong emphasis on risk reduction, so candidates with higher performance but higher risk fall behind. Candidate F, though ranked lower overall, may still be reasonable if slightly higher risk is acceptable in exchange for improved performance or other practical considerations in early development. The key trade-off is balancing reliability against performance, with risk reduction driving the choice.
```mermaid
flowchart LR
    A[Vague R&D Requirement]
    B[LLM-based Problem Formulation]
    C[Structured Decision Schema]
    D[Candidate Data]
    E[Evaluation & Ranking]
    F[Visualization]
    G[LLM-based Explanation]
    A --> B --> C --> E
    D --> E
    E --> F
    C --> G
    E --> G
```
This framework introduces an explicit ValidatedDecision boundary that enforces consistency between problem formulation and candidate data.
```mermaid
flowchart LR
    A[Problem Definition]
    B[Candidate Metrics]
    V[Validated Decision]
    R[Ranking]
    Z[Visualization]
    E[Explanation]
    A --> V
    B --> V
    V --> R
    V --> Z
    V --> E
```
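One way to realize this boundary is a fail-fast check that runs before any ranking, visualization, or explanation. A minimal sketch (the `validate_decision` name is illustrative; a schema library could play the same role):

```python
def validate_decision(problem, candidates):
    """Fail fast if the problem formulation and candidate data are inconsistent.

    Every objective's metric_key must be present in every candidate's metrics;
    otherwise downstream scoring would silently skip or crash mid-pipeline.
    """
    required = {o["metric_key"] for o in problem["objectives"]}
    for candidate in candidates:
        missing = required - set(candidate["metrics"])
        if missing:
            raise ValueError(
                f"candidate {candidate['name']!r} lacks metrics: {sorted(missing)}")
    return problem, candidates  # only the validated pair crosses the boundary
```

Downstream modules then accept only the returned pair, which is what makes the boundary structural rather than a convention.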
- **Problem Formulation as a First-Class Object**
  Vague R&D intent is translated into explicit, optimization-ready structures consisting of objectives, priorities, and constraints. This formulation becomes a reusable, machine-readable research artifact rather than an implicit human process.
- **Strict Intermediate Schema**
  A shared, validated schema acts as a data contract between LLM outputs, evaluation logic, and visualization modules. This prevents silent failures and enforces consistency across the pipeline.
- **Decision-Ready Signals**
  Candidate data is treated as decision-ready signals, not raw measurements. Model predictions and experimental results are abstracted into comparable metrics that can be directly evaluated and ranked.
- **Human-in-the-Loop Design**
  Visualization and natural-language explanations are first-class outputs. The system is designed to support expert judgment by making trade-offs explicit and interpretable, rather than automating decisions blindly.
- **Clear Separation of Responsibilities**
  Each module has a single, well-defined role:
  - LLMs translate human intent and explain outcomes
  - Validation enforces consistency and fail-fast behavior
  - Optimization logic evaluates candidates
  - Visualization supports reasoning and comparison
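With these boundaries, the pipeline can be composed as a thin function into which the LLM-backed steps are injected as callables. A sketch with hypothetical names (`run_pipeline`, `formulate`, `explain`); injecting the LLM steps keeps the deterministic core testable without a model:

```python
def run_pipeline(requirement, candidates, formulate, explain):
    """Compose the modules: formulation -> validation -> ranking -> explanation.

    `formulate` and `explain` stand in for the LLM-backed steps; they are
    injected as callables so the deterministic core runs without an LLM.
    """
    problem = formulate(requirement)        # LLM: vague intent -> structured problem
    required = {o["metric_key"] for o in problem["objectives"]}
    for c in candidates:                    # validation boundary (fail fast)
        if required - set(c["metrics"]):
            raise ValueError(f"inconsistent candidate: {c['name']}")
    def score(c):                           # deterministic weighted-sum evaluation
        return sum((1 if o["direction"] == "maximize" else -1) * o["weight"]
                   * c["metrics"][o["metric_key"]]
                   for o in problem["objectives"])
    ranking = sorted(candidates, key=score, reverse=True)
    return ranking, explain(problem, ranking)  # LLM: outcomes -> narrative
```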
This project makes the following key contributions:
- **Problem Formulation as a Machine-Readable Research Object**
  Elevates decision formulation (objectives, trade-offs, and priorities) to a first-class, inspectable artifact that can be shared, validated, and reused across decision-making workflows.
- **An Explicit Validation Boundary for AI-Assisted Decisions**
  Introduces the ValidatedDecision boundary as a structural safeguard that prevents silent inconsistencies between LLM-generated problem definitions and quantitative candidate data.
- **Separation of Semantic Meaning and Evaluation Mechanics**
  Cleanly decouples human-readable intent (`name`) from machine-evaluable signals (`metric_key`), enabling robust evaluation, visualization, and explanation without relying on fragile string matching.
- **A Reusable Human-in-the-Loop Decision Pipeline**
  Combines LLM-based problem formulation and explanation with deterministic, auditable evaluation logic, supporting transparent and reproducible R&D decisions across domains.
This project demonstrates how Generative AI, structured schemas, and optimization logic can be combined to make early-stage R&D decisions explainable, comparable, and robust.
By treating problem formulation and validation as first-class components, the framework moves beyond ad-hoc AI assistance toward reproducible, human-centered decision support systems.
A concise slide deck summarizing the motivation, architecture, and contributions of this project is available in /slides.
