
SEAN: Image Synthesis With Semantic Region-Adaptive Normalization

Authors: Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka

Description: We propose semantic region-adaptive normalization (SEAN), a simple but effective building block for Generative Adversarial Networks conditioned on segmentation masks that describe the semantic regions in the desired output image. Using SEAN normalization, we can build a network architecture that can control the style of each semantic region individually, e.g., we can specify one style reference image per region. SEAN is better suited to encode, transfer, and synthesize style than the best previous method in terms of reconstruction quality, variability, and visual quality. We evaluate SEAN on multiple datasets and report better quantitative metrics (e.g. FID, PSNR) than the current state of the art. SEAN also pushes the frontier of interactive image editing. We can interactively edit images by changing segmentation masks or the style for any given region. We can also interpolate styles from two reference images per region.
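
The description outlines how SEAN modulates normalized activations with a separate style code per semantic region. Below is a minimal, hedged PyTorch sketch of that idea, not the authors' implementation: the class name RegionAdaptiveNorm, the use of instance normalization, the style_dim size, and the simple mask-broadcast modulation are all assumptions (SEAN additionally blends style-derived and mask-derived parameters with learned weights, which this sketch omits).

    # Illustrative sketch of per-region style modulation in the spirit of SEAN.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RegionAdaptiveNorm(nn.Module):
        def __init__(self, num_features, style_dim=512):
            super().__init__()
            # Parameter-free normalization of the incoming activations (assumption:
            # instance norm; the paper's choice of normalization may differ).
            self.norm = nn.InstanceNorm2d(num_features, affine=False)
            # Map each region's style code to per-channel modulation parameters.
            self.to_gamma = nn.Linear(style_dim, num_features)
            self.to_beta = nn.Linear(style_dim, num_features)

        def forward(self, x, seg_mask, region_styles):
            """
            x:             (B, C, H, W) activations
            seg_mask:      (B, R, H, W) one-hot semantic mask (float), R regions
            region_styles: (B, R, style_dim) one style code per region
            """
            normalized = self.norm(x)
            # Per-region scale and shift: (B, R, C)
            gamma_r = self.to_gamma(region_styles)
            beta_r = self.to_beta(region_styles)
            # Resize the mask to the feature resolution and broadcast each
            # region's parameters over the pixels it covers.
            mask = F.interpolate(seg_mask, size=x.shape[2:], mode='nearest')
            gamma = torch.einsum('brhw,brc->bchw', mask, gamma_r)
            beta = torch.einsum('brhw,brc->bchw', mask, beta_r)
            return normalized * (1 + gamma) + beta

Because the modulation parameters are computed per region, swapping the style code of a single region (e.g. hair or background) changes only that region's appearance, which is what enables the per-region editing and style interpolation described above.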

Video "SEAN: Image Synthesis With Semantic Region-Adaptive Normalization" from the ComputerVisionFoundation Videos channel
Video information
July 17, 2020, 9:13:46
00:04:56
Other videos from this channel
Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation
High-Resolution Radar Dataset for Semi-Supervised Learning of Dynamic Objects
Learning to Dress 3D People in Generative Clothing
Learning Physics-Guided Face Relighting Under Directional Light
Orthogonal Convolutional Neural Networks
232 - Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding
Match or No Match: Keypoint Filtering Based on Matching Probability
DeepLPF: Deep Local Parametric Filters for Image Enhancement
324 - Weakly Supervised Deep Reinforcement Learning for Video Summarization With Semantically Meani...
Neural Architecture Search for Lightweight Non-Local Networks
Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a...
High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks
1257 - Multimodal Prototypical Networks for Few-shot Learning
1369 - CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection
515 - Cinematic-L1 Video Stabilization with a Log-Homography Model
71 - DeepCSR: A 3D Deep Learning Approach For Cortical Surface Reconstruction
Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes
Rethinking Zero-Shot Video Classification: End-to-End Training for Realistic Applications
12-in-1: Multi-Task Vision and Language Representation Learning
End-to-End Camera Calibration for Broadcast Videos
653 - Misclassification Risk and Uncertainty Quantification in Deep Classifiers