Yonsei University’s Breakthrough AI Research — Powered by AWS Trainium on Theta EdgeCloud

By Theta Labs · Published May 8, 2026 · 6 min read · Source: Blockchain Tag

Two landmark papers on personalized AI reward modeling, trained on AWS Trainium instances via Theta EdgeCloud Hybrid, mark a new era for decentralized AI infrastructure in academic research.

We are proud to announce that Yonsei University’s Data & Language Intelligence Lab — led by Professor Dongha Lee — has published two groundbreaking research papers in personalized AI reward modeling, with experiments conducted on AWS Trainium instances deployed through Theta EdgeCloud Hybrid.

These results represent a significant milestone: world-class academic AI research running on decentralized cloud infrastructure, at scale, with reproducibility that traditional compute solutions struggle to match.

The two papers — PIGReward and P-Check — tackle one of the hardest open problems in modern AI: how do you build a model that doesn’t just satisfy an average user, but adapts to the unique preferences of each individual?

Paper 1: PIGReward

Personalized Reward Modeling for Text-to-Image Generation

The Problem It Solves

AI image generators like Stable Diffusion or DALL·E can produce stunning visuals — but whether a generated image is actually good depends entirely on who is looking at it. One person values realism; another wants vibrant color; another prioritizes minimalist composition. Standard reward models evaluate images against a single universal rubric, missing the rich diversity of individual taste.

What PIGReward Does

PIGReward introduces a personalized reward model for text-to-image generation with two core innovations:

Critically, PIGReward addresses the cold-start problem of personalization — what do you do when a user has very little history? Its self-bootstrapping strategy constructs a rich user context from just a small number of reference images, enabling personalization without retraining the model for each user.

Beyond scoring, PIGReward also generates personalized feedback that can drive prompt optimization — directly improving what the model generates next for that specific user.
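
The feedback-to-prompt loop described above can be illustrated with a small sketch. Everything here is a hypothetical stand-in rather than PIGReward's actual interface: `personalized_reward` is a toy keyword matcher that scores the prompt text instead of a generated image, and `optimize_prompt` simply appends the first missing preference on each round.

```python
from typing import List, Tuple


def personalized_reward(prompt: str, preferred: List[str]) -> Tuple[float, List[str]]:
    """Toy stand-in for a personalized reward model (assumption, not the paper's model).

    Returns a score in [0, 1] and, as "feedback", the user preferences
    that the prompt does not yet mention.
    """
    missing = [p for p in preferred if p not in prompt.lower()]
    score = 1.0 - len(missing) / len(preferred) if preferred else 1.0
    return score, missing


def optimize_prompt(prompt: str, preferred: List[str], max_rounds: int = 3) -> str:
    """Revise the prompt using the reward model's feedback, one item per round."""
    for _ in range(max_rounds):
        _, missing = personalized_reward(prompt, preferred)
        if not missing:
            break  # nothing left to fix for this user
        prompt = f"{prompt}, {missing[0]}"
    return prompt


user_prefs = ["vibrant color", "minimalist composition"]
print(optimize_prompt("a cat on a sofa", user_prefs))
```

In a real system the feedback would come from the reward model's reasoning over generated images, and the rewrite step would likely be handled by a language model rather than string concatenation.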

🔑 Key Insight

PIGReward reframes image quality evaluation as a personalized, reasoning-driven process — not a one-size-fits-all metric. This is the difference between a rating system and a genuinely intelligent creative collaborator.

Paper 2: P-Check

Advancing Personalized Reward Model via Learning to Generate Dynamic Checklist

The Problem It Solves

Large Language Models (LLMs) are increasingly deployed as personal AI assistants. But the reward models used to align their behavior — trained on global, averaged preference data — don’t reflect how different users actually judge quality. The same response can be great for one person and completely miss the mark for another.

Existing personalized reward approaches treat each user’s context as a static persona: a fixed description inferred from their history. This misses two key dynamics: what concretely drives a user’s judgment in a specific context, and how those drivers shift from task to task.

What P-Check Does

P-Check introduces a plug-and-play checklist generator that dynamically creates query-specific evaluation criteria drawn from each user’s interaction history. Rather than a static persona, the judge receives a live checklist — explicit, actionable criteria tuned to both the user and the current task.

This mirrors how humans actually evaluate: we don’t apply the same rubric to every situation. Judging code quality, essay style, and recipe suggestions each requires a different lens.
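
As a rough illustration of judging against a query-specific checklist instead of a fixed persona, consider the sketch below. The names `Criterion`, `build_checklist`, and `score` are hypothetical, and the hand-written rules stand in for what the paper describes as a learned checklist generator and a model-based judge.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    text: str                     # human-readable checklist item
    check: Callable[[str], bool]  # toy stand-in for a judge model's verdict
    weight: float = 1.0


def build_checklist(history: List[str], query: str) -> List[Criterion]:
    """Hypothetical rule-based generator: criteria depend on both user and task."""
    checklist: List[Criterion] = []
    if any("concise" in h.lower() for h in history):
        checklist.append(
            Criterion("Answer is brief (under 30 words)", lambda r: len(r.split()) < 30)
        )
    if "code" in query.lower():
        checklist.append(
            Criterion("Includes a function definition", lambda r: "def " in r, weight=2.0)
        )
    else:
        checklist.append(
            Criterion("Avoids code in a non-coding task", lambda r: "def " not in r)
        )
    return checklist


def score(response: str, checklist: List[Criterion]) -> float:
    """Weighted fraction of checklist items the response satisfies."""
    total = sum(c.weight for c in checklist)
    met = sum(c.weight for c in checklist if c.check(response))
    return met / total if total else 0.0


history = ["User asked for concise answers"]
checklist = build_checklist(history, "Write code to reverse a list")
print(score("def rev(xs): return xs[::-1]", checklist))
```

The same user history yields a different checklist for a recipe request than for a coding request, which is the point of generating criteria per query.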

The Training Innovation: Preference-Contrastive Criterion Weighting

Simply distilling checklists from annotated preference pairs produces generic criteria that mix objective quality with subjective taste. P-Check solves this with a two-step training strategy:

The result: P-Check consistently outperforms existing personalized reward models across multiple benchmarks, including out-of-distribution settings. Its checklist outputs also serve as direct verbal feedback to the generator — enabling lightweight personalization without updating any model parameters.
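
The text here does not spell out the two training steps, so the following is only an illustration of the general idea suggested by the name "preference-contrastive criterion weighting", not P-Check's actual procedure. The assumption is that a criterion deserves weight when it separates a user's chosen response from the rejected one, and little or none when both responses satisfy it equally. The function `contrastive_weights` and the toy criteria are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[str], bool]


def contrastive_weights(
    criteria: Dict[str, Check], pairs: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Weight each criterion by how often it favors the chosen response.

    `pairs` holds (chosen, rejected) responses. A criterion that both
    responses satisfy (generic quality) contributes nothing to the contrast.
    """
    weights: Dict[str, float] = {}
    for name, check in criteria.items():
        wins = sum(check(c) and not check(r) for c, r in pairs)
        losses = sum(check(r) and not check(c) for c, r in pairs)
        weights[name] = max(0.0, (wins - losses) / len(pairs)) if pairs else 0.0
    return weights


criteria = {
    "is_polite": lambda r: "thanks" in r,
    "is_brief": lambda r: len(r.split()) <= 5,
}
pairs = [
    ("thanks, done", "thanks, here is a very long explanation of everything"),
    ("ok thanks", "thanks for asking, let me elaborate at great length"),
]
print(contrastive_weights(criteria, pairs))
```

In this toy data, politeness appears in every response and so carries no personal signal, while brevity distinguishes what this user actually preferred.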

🔑 Key Insight

P-Check shows that personalization isn’t just about knowing who the user is — it’s about dynamically understanding what they care about right now, for this specific task. That distinction is what separates genuinely personalized AI from a system that merely remembers your name.

Why This Matters for Theta Network

The models in both papers were trained and validated on AWS Trainium (Trn) instances deployed through Theta EdgeCloud Hybrid, making this a direct demonstration of what our infrastructure enables at the frontier of AI research.

Three Industry Firsts Behind This Work

Personalized reward modeling requires training on large-scale preference datasets with millions of simulated user interactions. The compute demands are substantial, and reproducibility is everything in academic research. AWS Trainium on Theta EdgeCloud delivered both: high-performance training at a cost efficiency that traditional cloud infrastructure cannot match, and the deterministic, reproducible results that peer-reviewed research demands.

“Theta EdgeCloud has been an integral part of our research infrastructure over the past year. With the addition of AWS Trainium, we can now scale our experiments faster, more efficiently, and with greater reproducibility. This enables us to push the boundaries of conversational AI and recommendation systems in ways that were previously not practical.”
- Professor Dongha Lee, Yonsei University

“Yonsei University’s adoption of AWS Trainium on Theta EdgeCloud Hybrid is a perfect example of how decentralized blockchain infrastructure and cutting-edge AI silicon can work hand-in-hand to accelerate world-class research.”
- Mitch Liu, Co-founder and CEO, Theta Labs

The Bigger Picture

PIGReward and P-Check are not just academic papers. They represent a vision for the future of AI: systems that learn individual human preferences with precision, adapt dynamically to each interaction, and provide transparent, reasoned evaluations rather than black-box scores.

That future requires infrastructure that is performant, cost-accessible, and reproducible at scale. Theta EdgeCloud — powered by AWS Trainium — is built to be exactly that for the global AI research community.

As AI moves from general-purpose to deeply personalized, the infrastructure it runs on matters more than ever. We’re proud to be the platform that helped bring these breakthroughs to life — and we’re just getting started.

🔗 Read the Papers

PIGReward: https://arxiv.org/abs/2511.19458

P-Check: https://arxiv.org/abs/2601.02986 | Code: https://github.com/tommyEzreal/P-Check_

Yonsei Data & Language Intelligence Lab: https://diyonsei.notion.site/

Theta EdgeCloud: https://www.thetaedgecloud.com

This article was originally published on Blockchain Tag and is republished here under RSS syndication for informational purposes. All rights and intellectual property remain with the original author. If you are the author and wish to have this article removed, please contact us at [email protected].
