Project Phillip: Revolutionizing AI-Powered Image Creation

Team Engineers @ UCR: Michael Chen, Freddy Song, QuarterShotofEspresso, Peter Lu, Xianghao Kong

Published July 31, 2024 © GPL3+

PHiLIP: Personalized Human in Loop Image Production

AI-powered image generation & enhancement suite: Create, customize & transform images with cutting-edge ML models.

Intermediate | Work in progress | Over 4 days

AMD University Program Award

Pervasive AI Developer Contest

Things used in this project

Software apps and online services

AMD ROCm™ Software

Story

What is Project Phillip?

Project Phillip is an advanced AI-powered image generation and enhancement suite. It combines cutting-edge machine learning models with a user-friendly interface to democratize access to high-quality AI-generated imagery.

Key Features:

Text-to-Image Generation: Create images from textual descriptions.
Image Enhancement: Apply various styles and improve image quality.
User-Guided Creation: Iterative refinement process for precise results.
Efficient Model Management: Dynamically switch between AI models.

Why Did We Decide to Make It?

Our team, comprising PhD students in Machine Learning, undergraduates, and AI enthusiasts, saw a gap between advanced AI capabilities and accessible tools for creatives. We wanted to:

Bridge the gap between complex AI models and user-friendly applications.
Empower artists, designers, and content creators with AI-assisted tools.
Explore the potential of AMD's cloud infrastructure and Instinct MI210 GPUs in AI applications.
Create a platform for collaborative learning and innovation in AI.

How Does It Work?

Text-to-Image Generation:

Users input a text description.
Our fine-tuned PixArt-alpha models process the text.
The system quickly generates several low-quality guidance images from the description (≈2 seconds per image).

[Figures: Website, Human, City]
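
The draft stage described above can be sketched as a small planning step that fans one prompt out into several cheap jobs. This is only an illustration under stated assumptions: the DraftJob structure, the plan_drafts and estimated_seconds helpers, and the default step count and resolution are hypothetical and are not taken from the project's code. Only the figure of roughly 2 seconds per image comes from the write-up.

```python
from dataclasses import dataclass

# From the write-up: roughly 2 seconds per guidance image.
SECONDS_PER_DRAFT = 2.0


@dataclass(frozen=True)
class DraftJob:
    """One low-cost generation job (hypothetical structure)."""
    prompt: str
    seed: int
    steps: int  # assumed: few denoising steps keep drafts fast
    size: int   # assumed: small square resolution in pixels


def plan_drafts(prompt: str, count: int = 4, base_seed: int = 0,
                steps: int = 8, size: int = 256) -> list:
    """Fan one prompt out into `count` draft jobs that differ only by seed."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if count < 1:
        raise ValueError("count must be at least 1")
    return [DraftJob(prompt, base_seed + i, steps, size) for i in range(count)]


def estimated_seconds(jobs: list) -> float:
    """Rough time estimate if the drafts are generated one after another."""
    return len(jobs) * SECONDS_PER_DRAFT


if __name__ == "__main__":
    jobs = plan_drafts("a picnic in a city park", count=4, base_seed=42)
    for job in jobs:
        print(job)
    print(f"estimated time: {estimated_seconds(jobs):.0f} s")
```

Varying only the seed keeps the drafts comparable, so the user picks between compositions rather than between settings.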

Image Enhancement:

Users can upload existing images or select previously generated ones.
They choose from various enhancement options (e.g., style transfer, upscaling).
The selected AI model applies the chosen enhancement to the image (basic enhancement ≈ 25 seconds; larger enhancement or style change ≈ 55 seconds).

[Figures: City (ControlNet), City (Cyberpunk), Picnic (Anime)]
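
One way to picture the enhancement step is a lookup table that maps the chosen option to a handler and its approximate running time. The sketch below is hypothetical: the option names, the Enhancement tuple, and the placeholder handlers are assumptions made for illustration, and the handlers return the image unchanged instead of running a model. The 25 and 55 second figures are the ones quoted above.

```python
from typing import Callable, NamedTuple


class Enhancement(NamedTuple):
    handler: Callable[[bytes], bytes]
    approx_seconds: int  # timing quoted in the write-up


def basic_enhance(image: bytes) -> bytes:
    # Placeholder: a real handler would run an enhancement model here.
    return image


def style_change(image: bytes) -> bytes:
    # Placeholder: a real handler would run a style model here.
    return image


# Hypothetical option names; the project may label its options differently.
ENHANCEMENTS = {
    "basic": Enhancement(basic_enhance, 25),
    "style_change": Enhancement(style_change, 55),
}


def enhance(image: bytes, option: str) -> bytes:
    """Apply the enhancement registered under `option` to the image bytes."""
    try:
        enhancement = ENHANCEMENTS[option]
    except KeyError:
        choices = ", ".join(sorted(ENHANCEMENTS))
        raise ValueError(f"unknown option {option!r}; choose from: {choices}") from None
    return enhancement.handler(image)
```

Keeping the time estimate next to the handler would let an interface warn the user before starting a slower style change.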

User-Guided Creation:

Users can iteratively refine the generated images.
They can adjust parameters, add more descriptive text, or use reference images.
The system regenerates images based on the new inputs.
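
The refinement loop above can be modelled as a request object that is copied and updated on each round. The following is a minimal sketch, assuming a hypothetical GenerationRequest structure and refine helper; the parameter names in the usage example (guidance_scale, steps) are illustrative and are not confirmed settings of the project.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class GenerationRequest:
    """Everything needed to regenerate an image (hypothetical structure)."""
    prompt: str
    params: dict = field(default_factory=dict)
    reference_image: Optional[str] = None  # assumed: a path or image ID


def refine(request: GenerationRequest, extra_text: str = "",
           reference_image: Optional[str] = None,
           **param_updates) -> GenerationRequest:
    """Return a new request with added text, updated parameters, or a new reference."""
    prompt = request.prompt
    if extra_text.strip():
        prompt = f"{prompt}, {extra_text.strip()}"
    # Build a fresh dict so the earlier request is left untouched.
    params = {**request.params, **param_updates}
    return replace(request, prompt=prompt, params=params,
                   reference_image=reference_image or request.reference_image)


if __name__ == "__main__":
    first = GenerationRequest("a city skyline", {"guidance_scale": 4.5})
    second = refine(first, "at sunset", steps=20)
    third = refine(second, reference_image="ref.png")
    for round_number, req in enumerate([first, second, third], start=1):
        print(round_number, req)
```

Because each round produces a new request and leaves the previous one unchanged, earlier rounds stay available if the user wants to step back.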

Behind the Scenes:

Our Flask-based API manages requests and routes them to the appropriate AI models.
AMD cloud infrastructure and Instinct MI210 GPUs power the computationally intensive processes.
An efficient model management system switches between AI models as needed.
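
Model switching of this kind is often handled by a small manager that loads a model on first use and evicts the least recently used one when memory is limited. The sketch below shows that general pattern; it is not the project's actual manager. The ModelManager class, the route helper, the task names, and the model keys are assumptions, and the loaders are stand-ins for real model loading. In a Flask app, a request handler could call route before running inference.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict


class ModelManager:
    """Keeps at most `capacity` models loaded, evicting the least recently used."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]], capacity: int = 1):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loaders = dict(loaders)
        self._capacity = capacity
        self._loaded: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, name: str) -> Any:
        """Return the named model, loading it (and evicting others) if needed."""
        if name in self._loaded:
            self._loaded.move_to_end(name)  # mark as most recently used
            return self._loaded[name]
        if name not in self._loaders:
            raise KeyError(f"unknown model: {name}")
        while len(self._loaded) >= self._capacity:
            # Drop the oldest entry so its memory can be reclaimed.
            self._loaded.popitem(last=False)
        model = self._loaders[name]()
        self._loaded[name] = model
        return model

    @property
    def loaded(self) -> list:
        """Names of the models currently held, oldest first."""
        return list(self._loaded)


# Hypothetical mapping from request type to model key.
TASK_TO_MODEL = {"generate": "pixart-alpha", "enhance": "controlnet"}


def route(manager: ModelManager, task: str) -> Any:
    """Pick the model for a task and make sure it is loaded."""
    if task not in TASK_TO_MODEL:
        raise ValueError(f"unknown task: {task}")
    return manager.get(TASK_TO_MODEL[task])
```

With a capacity of 1, only one model is held at a time, which trades some reload time for a smaller memory footprint.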

Our Journey

We started as a diverse group with varying levels of AI experience, united by our fascination with the potential of AI in creative fields. The journey from concept to a working prototype was filled with challenges:

Struggling at first with cloud configuration and GPU optimization.
Overcoming the steep learning curve of advanced AI models.
Iterating countless times to improve image quality and generation speed.

Our breakthrough came when we generated our first coherent image from a text prompt. From there, we iterated rapidly, adding features and refining our models.

Today, Project Phillip stands as a testament to collaborative learning and innovation. It represents not just a tool, but a stepping stone towards more accessible and powerful AI-assisted creativity.

Future and Goals

Looking ahead, we aim to:

Expand into video generation and editing.
Develop more advanced transfer learning techniques.
Create educational modules to share our knowledge with the wider community.

Project Phillip is more than just software; it's our contribution to democratizing AI-powered creativity and pushing the boundaries of what's possible at the intersection of technology and art.