I have been a PhD student at the Computer Graphics Laboratory at ETH Zurich, in cooperation with DisneyResearch|Studios, since May 2021. I am supervised by Prof. Markus Gross and Dr. Romann M. Weber. My research interests span various areas of deep learning, especially generative modeling. I find it most fascinating and satisfying to work on projects that interact with the environment in some way, i.e., where I can see/hear/feel their results. In recent years, I have focused mostly on face-related image and video generation tasks.
Before starting my PhD studies, I completed my master’s degree in Robotics, Cognition, Intelligence at the Technical University of Munich. During that time, I became really interested in deep learning and had the privilege of working on my master’s thesis and several other projects in Prof. Laura Leal-Taixé’s Dynamic Vision and Learning Group. Prior to that, I completed my bachelor’s degree in IT-Automotive at the Baden-Wuerttemberg Cooperative State University (DHBW) Stuttgart in cooperation with Robert Bosch GmbH.
Preprint
We propose motion-textual inversion, a general method to transfer the semantic motion of a given reference motion video to given target images. To this end, we optimize a motion representation composed of a set of text/image embedding tokens using a frozen, pre-trained image-to-video diffusion model. Our method generalizes across various domains and supports multiple types of motions, including full-body, face, camera, and even hand-crafted motions.
Manuel Kansy (DisneyResearch|Studios / ETH Zurich), Jacek Naruniec (DisneyResearch|Studios), Christopher Schroers (DisneyResearch|Studios), Markus Gross (DisneyResearch|Studios / ETH Zurich), Romann M. Weber (DisneyResearch|Studios)
Preprint
We show that applying classifier-free guidance (CFG) does not require any specific training procedure (e.g., inserting a null condition during training), and CFG can be extended to a more general method that is applicable to any diffusion model, including unconditional ones.
Seyedmorteza Sadat (DisneyResearch|Studios / ETH Zurich), Manuel Kansy (DisneyResearch|Studios / ETH Zurich), Otmar Hilliges (ETH Zurich), Romann M. Weber (DisneyResearch|Studios)
IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2023
We tackle the challenging task of inverting the latent space of pre-trained face recognition models without full model access (i.e., in the black-box setting). Our method, the identity denoising diffusion probabilistic model (ID3PM), leverages the stochastic nature of the denoising diffusion process to produce high-quality, identity-preserving face images with various backgrounds, lighting, poses, and expressions.
Manuel Kansy (DisneyResearch|Studios / ETH Zurich), Anton Raël (ETH Zurich), Graziana Mignone (DisneyResearch|Studios), Jacek Naruniec (DisneyResearch|Studios), Christopher Schroers (DisneyResearch|Studios), Markus Gross (DisneyResearch|Studios / ETH Zurich), Romann M. Weber (DisneyResearch|Studios)
Project Page — Paper — Supplementary Material
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, 2023
The terms high-resolution and high-quality are not equivalent, and high-resolution does not always imply high-quality. In this paper, we motivate and precisely define the concept of effective resolution and propose a novel self-supervised learning scheme to train a neural network for effective resolution estimation. We demonstrate that our method outperforms state-of-the-art image quality assessment methods in estimating the sharpness of real and generated human faces, despite using only unlabeled data during training.
Manuel Kansy (DisneyResearch|Studios / ETH Zurich), Julian Balletshofer (DisneyResearch|Studios), Jacek Naruniec (DisneyResearch|Studios), Christopher Schroers (DisneyResearch|Studios), Graziana Mignone (DisneyResearch|Studios), Markus Gross (DisneyResearch|Studios / ETH Zurich), Romann M. Weber (DisneyResearch|Studios)
Project Page — Video — Paper — Supplementary Material
Master’s Thesis (2020 - 2021)
Used technologies: Python, PyTorch, Ubuntu
Course: Master Practical (2019 - 2020)
Used technologies: Python, PyTorch, Ubuntu
Course: Reinforcement Learning for Robotics (2019 - 2020)
Used technologies: MATLAB, Simulink, macOS
Course: Advanced Deep Learning for Computer Vision (2019)
Used technologies: Python, PyTorch, Ubuntu
(2015 - 2018)
Used technologies: Simulink, MATLAB, C++, C#, Unity, Windows, macOS, iOS, Embedded Linux, Vehicle Deployment
If you are an ETH student and are interested in doing your thesis in our group, feel free to reach out to me via email.
Master’s Thesis (Fall 2024)
Master’s Thesis (Fall 2023)
Master’s Thesis (Fall 2023)
Semester Thesis (Spring 2023)
Bachelor’s Thesis (Spring 2023)
Master’s Thesis (Spring 2023)
Master’s Thesis (Fall 2022)
Bachelor’s Thesis (Fall 2022)
Bachelor’s Thesis (Spring 2022)
Bachelor’s Thesis (Spring 2022)
Master’s Thesis (Spring 2022)
Backoffice TA (Fall 2024)
Head TA (Fall 2023)
Head TA (Fall 2023)
Head TA (Fall 2022)
Head TA (Fall 2022)
Regular TA (Spring 2022)
Regular TA (Fall 2021)
Student organization (2016 - 2018)