"Anime Generation with AI".
- Video: Generated Anime: https://youtu.be/X9j1fwexK2c
- Video: Other AI Solutions for Anime Production Issues: https://youtu.be/Gz90H1M7_u4
2. AGENDA
- Anime Generation with AI -
Overview
Recent Progress on Image Generation and Issues
Our Approaches and Contributions
Anime Generation with PSGAN
Anime Inbetweening with SPGAN
Other AI Solutions for Anime Production Issues
Conclusion
4. AI Generated Anime Inbetweening on Zunda Horizon* Test Data**
Input Frames AI Generated Frames
(x16 Generation)
AI anime inbetweening for a wide variety of objects
*SSS STL WAO 2017 “Zunda Horizon”
**Trained using DeNA Dataset, which does not include “Zunda Horizon” data
Generated Anime: https://youtu.be/X9j1fwexK2c?t=4
5. AI Generated Anime Inbetweening on Zunda Horizon* Test Data**
Comparison with human animators
Input Frames
AI Generated Frames (x4 Generation)
Actual Frames drawn by Human Animators (x4 Generation)
*SSS STL WAO 2017 “Zunda Horizon”
**Trained using DeNA Dataset, which does not include “Zunda Horizon” data
Generated Anime: https://youtu.be/X9j1fwexK2c?t=80
6. Anime Generation with DeNA AI
Progressive Structure-conditional GANs (PSGAN)
Successful image generation of full-body and high-resolution characters
Diverse characters and anime generation
- Generation of brand new characters
- Add animation to the new character by imposing a pose sequence
[Figure: PSGAN architecture growing from 4x4 through 512x512 to 1024x1024; labels: Latent, Real, Condition, Generated; sample of Generated Anime]
Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks.
Koichi Hamada, Kentaro Tachibana, Tianqi Li, Hiroto Honda, and Yusuke Uchida. In ECCV Workshop 2018.
Generated Anime: https://youtu.be/X9j1fwexK2c?t=104
7. Anime Generation with DeNA AI
Anime inbetweening
Successful video interpolation between frames with large structural movement
Structure-consistent Prediction GANs (SPGAN)
[Figure: Input frames compared with SPGAN (Ours) and the SOTA model Deep Voxel Flow (ICCV'17); PSNR and SSIM plotted against structural displacement for both methods]
Challenges toward Anime Generation with Deep Generative Models.
Koichi Hamada and Tianqi Li. In DeNA TechCon 2019.
Generated Anime: https://youtu.be/X9j1fwexK2c?t=132
8. Anime Generation with DeNA AI
Anime generation with a few images
Successful image generation with detailed textures for each structural element
Structural Feature-embedding GANs (SFGAN)
[Figure: two examples, each with an Image (1 frame), a Rough designation (Structure designation), and the Generated result]
Generated Anime: https://youtu.be/X9j1fwexK2c?t=166
9. Anime Generation with DeNA AI
Background art generation
Successful landscape generation with designated detailed texture for each part
Structural Feature-embedding GANs (SFGAN)
[Figure: Image (1 frame) and Layout; Generated Result from SFGAN (Ours) and from the SoTA model SPADE (CVPR’19)]
10. Anime Generation with DeNA AI
Exact colorization based on colorized example
Successful colorization that exactly reflects color example and line details
Structural Feature-embedding GANs (SFGAN)
[Figure labels: Colorized example (1 frame), Lines, Rough, Colorized result]
Generated Anime: https://youtu.be/X9j1fwexK2c?t=191
12. Anime Generation with DeNA AI
Background:
In 2018, AI generates high-quality images hard to distinguish from real photos
Challenge:
- Anime Generation: High-quality image generation with complex structures
- Anime Inbetweening: Interpolation between frames with large structural movement
Our Solution: Structure-Aware Generative Learning
- Progressive Structure-conditional GANs (PSGAN)
- Structure-consistent Prediction GANs (SPGAN)
Successful generation of diverse characters and animations
13. We will talk about:
- Progress and challenges in cutting-edge image generation
- A solution by DeNA AI’s Structure-Aware Generative Learning
Anime Generation with DeNA AI
14. Koichi Hamada (@hamadakoichi)
Launched the Group for Machine Learning (ML) at DeNA in 2010
Have developed a broad range of services utilizing ML for over 9 years
2010 – ML for Games
Launched ML group at DeNA. Applied ML/DM to improve games
2011 – ML for the Gaming Platform ‘Mobage’ (51 million users)
Developed dozens of distributed ML systems for a game platform
2014 – present: ML for All Services at DeNA
Develop ML systems for a wide range of services
Examples: Social Network, Manga, Gaming Platform, Chatbot, News
15. Koichi Hamada (@hamadakoichi)
Launched the Group for Machine Learning (ML) at DeNA in 2010
Have developed a broad range of services utilizing ML for over 9 years
Ph.D. in Theoretical Physics (Quantum and Statistical Field Theory)
Book: “Technologies that support the large-scale social gaming platform Mobage” (Best Book Award in CEDEC 2014)
Founder - TokyoWebmining Community (February 2010)
- Objective: Expand the fields of practical applications of Machine Learning
- 1,500 registered participants with over 60 organized meet-ups
16. Koichi Hamada (@hamadakoichi)
My activities:
Have designed and developed
new valuable experiences and services utilizing AI
1. Research and Develop AI Models
2. Design User Experiences
3. Design Services
4. Design Auto Refinement Cycles
5. Design Interfaces
6. Design Logging
7. Design Distributed Algorithms
8. Implement Distributed Algorithms
[Diagram labels: New Valuable User Experiences; Service Front End; Distributed Back End (YARN, HDFS, GPU)]
17. AI Anime Generation Project
Koichi Hamada (@hamadakoichi)
AI Development and Project Lead
Designs, Implementations, Demonstration Experiments, Research Paper Publications,
Practical Applications, and Project Promotion
Generated Anime
AI Generated Results for Anime Inbetweening on “Zunda Horizon*” Test Data**: https://youtu.be/X9j1fwexK2c?t=4
Input Frames, AI Generated Frames (x16 Generation)
*SSS STL WAO 2017 “Zunda Horizon”
**Trained using DeNA Dataset, which does not include “Zunda Horizon” data
20. Question: Which image was generated by AI?
[Figure: eight images; 1, 3, 5, 7 in the top row and 2, 4, 6, 8 in the bottom row]
22. Question: Which image was generated by AI?
Answer: All of the top images
Top row (1, 3, 5, 7): AI
Bottom row (2, 4, 6, 8): Real Photos
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
23. High-resolution and high-quality image generation by AI
AI-generated images have become higher resolution and quality
and are harder to distinguish from real photos
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
27. Generator and Discriminator compete
and improve the generation quality
Generative Adversarial Networks (GANs)
Discriminator: classifies the input data as either real or fake
Generator: attempts to fool the Discriminator by generating realistic images
Generative Adversarial Nets.
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio.
arXiv:1406.2661. In NIPS 2014.
28. Generator and Discriminator compete
and improve the generation quality
Generative Adversarial Networks (GANs)
Minimax Objective function (Goodfellow+, NIPS2014, Deep Learning Workshop, Presentation)
- Discriminator tries to classify correctly (maximize):
  it classifies the real data as ‘real’ and the generated data as ‘fake’
- Generator tries to fool the Discriminator (minimize)
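The formula on this slide did not survive as text. For reference, the standard minimax objective from Goodfellow et al. (2014), which the maximize/minimize annotations appear to describe, can be written as follows. The symbols p_data (data distribution) and p_z (latent prior) follow that paper's notation and are not taken from the slide.

```latex
\min_{G}\max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The first term rewards the Discriminator for scoring real data as ‘real’; the second rewards it for scoring generated data as ‘fake’, and is the term the Generator tries to reduce.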
29. High-resolution and high-quality image generation with GANs
AI-generated images have become higher resolution and quality
and are harder to distinguish from real photos
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
31. Progressive GAN (Karras+, ICLR'18)
Progressive growth of Generator and Discriminator
Stable generation of 1024 x 1024 images
Generated Images (1024X1024)
Generated Images (256x256)
Progressive Growing of GANs for Improved Quality, Stability, and Variation
Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen. In ICLR 2018.
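A minimal sketch of the progressive-growth idea, assuming NumPy arrays stand in for network outputs. It shows a doubling resolution schedule up to 1024 x 1024 and the fade-in blend that Karras et al. describe for smoothly introducing a new higher-resolution layer. The function names and the starting resolution of 4 are illustrative choices, not part of the slides.

```python
import numpy as np


def resolution_schedule(start: int = 4, final: int = 1024) -> list:
    """Resolutions visited when the networks grow by doubling (4, 8, ..., final)."""
    resolutions = []
    res = start
    while res <= final:
        resolutions.append(res)
        res *= 2
    return resolutions


def upsample_nearest(img: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbour upsampling of an (H, W, C) image."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)


def fade_in(low_res_out: np.ndarray, high_res_out: np.ndarray, alpha: float) -> np.ndarray:
    """Blend the upsampled output of the old stage with the new stage's output.

    alpha ramps from 0 (only the old, low-resolution path) to 1 (only the new layer).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (1.0 - alpha) * upsample_nearest(low_res_out) + alpha * high_res_out


if __name__ == "__main__":
    print(resolution_schedule())
    old = np.ones((4, 4, 3))   # stand-in for the 4x4 stage output
    new = np.zeros((8, 8, 3))  # stand-in for the freshly added 8x8 layer output
    print(fade_in(old, new, alpha=0.25).shape)
```

Ramping alpha gradually is what keeps training stable while each new layer is introduced.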
36. BigGAN (Brock+, ICLR'19)
Class conditional image generation
Diverse and high-quality image generation using ImageNet
SNGAN with Projection (Miyato+, ICLR’18)
+ Spectral Normalization on Discriminator
+ Projection Discriminator
SAGAN (Zhang+, 18)
+ Spectral Normalization on Generator
+ Self Attention
+ Two Time Scale Update Rule
BigGAN (Brock+, ICLR’19)
+ Large Batch Size (256→2048)
+ Large Channel (64→96)
+ Shared Embedding
+ Hierarchical Latent Space
+ Truncation Trick
+ Orthogonal Regularization
+ First Singular Value Clamp
+ Zero-centered Gradient Penalty
Generated Images (512x512)
Large Scale GAN Training for High Fidelity Natural Image Synthesis.
Andrew Brock, Jeff Donahue, Karen Simonyan. arXiv:1809.11096. In ICLR 2019.
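Of the items listed for BigGAN, the Truncation Trick is the easiest to illustrate in isolation. The sketch below assumes the formulation in the BigGAN paper: latent values drawn from a standard normal are resampled whenever their magnitude exceeds a threshold. The function name, batch size, latent dimension, and threshold value are illustrative only.

```python
import numpy as np


def truncated_latents(batch: int, dim: int, threshold: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Sample z ~ N(0, I) and resample every entry with |z| > threshold."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    z = rng.standard_normal((batch, dim))
    outside = np.abs(z) > threshold
    while outside.any():
        # Redraw only the out-of-range entries and check again.
        z[outside] = rng.standard_normal(int(outside.sum()))
        outside = np.abs(z) > threshold
    return z


if __name__ == "__main__":
    z = truncated_latents(batch=8, dim=128, threshold=0.5,
                          rng=np.random.default_rng(0))
    print(z.shape, float(np.abs(z).max()))
```

A smaller threshold trades variety for fidelity, which is the trade-off the paper reports for this trick.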
37. BigGAN (Brock+, ICLR'19)
Class conditional image generation
Diverse and high-quality image generation using ImageNet
Generator
Typical Architecture
Res Block
Architecture for ImageNet at 512x512
Large Scale GAN Training for High Fidelity Natural Image Synthesis.
Andrew Brock, Jeff Donahue, Karen Simonyan. arXiv:1809.11096. In ICLR 2019.
Generated Images (512x512)
38. BigGAN (Brock+, ICLR'19)
Large Scale GAN Training for High Fidelity Natural Image Synthesis.
Andrew Brock, Jeff Donahue, Karen Simonyan. arXiv:1809.11096. In ICLR 2019.
Generates high-fidelity and diverse images using ImageNet with 1000 classes
Generated Images (512x512)
46. High-resolution and high-quality image generation with GANs
AI-generated images have become higher resolution and quality
and are harder to distinguish from real photos
However, full-body image generation with complex structures has been a challenge
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
48. High-resolution and high-quality image generation with GANs
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
Applications for Anime Production
Limited to some specific cases, such as
- Generation for specific body parts (e.g. Face)
- Colorization (which does not treat structural generation)
AI-generated images have become higher resolution and quality
and are harder to distinguish from real photos
However, full-body image generation with complex structures has been a challenge
51. DeNA AI: Diverse Characters and Anime Generation
Full-body anime generation with Progressive Structure-conditional GANs: https://youtu.be/bIi5gSITK0E
Generated results: Brand new characters
52. DeNA AI: Diverse Characters and Anime Generation
Adding action to full-body anime characters with Progressive Structure-conditional GANs: https://youtu.be/0LQlfkvQ3Ok
Can add animation to the new character
by specifying a sequence of 2D poses
53. DeNA AI: Anime Inbetweening
Anime inbetweening
Successful video interpolation between frames with large structural movement
Structure-consistent Prediction GANs (SPGAN)
[Figure: Input frames compared with SPGAN (Ours) and the SOTA model Deep Voxel Flow (ICCV'17); PSNR and SSIM plotted against structural displacement for both methods]
Challenges toward Anime Generation with Deep Generative Models
Koichi Hamada and Tianqi Li. In DeNA TechCon 2019.
Generated Anime: https://youtu.be/X9j1fwexK2c?t=132
54. DeNA AI: Anime Inbetweening
Can inbetween frames with large structural movement
(e.g. turning around)
[Figure: Input, SOTA model Deep Voxel Flow, and SPGAN (Ours), shown for inbetweening of frames with small movement and of frames with large movement]
Experimental Results: “Anime Frame Generation with Structure-consistent Prediction GANs”: https://youtu.be/vXVr64BbXHY
Video: https://youtu.be/X9j1fwexK2c?t=139
55. DeNA AI: Anime Inbetweening
Can inbetween frames with large structural movement
with good structural and time consistency
[Figure: Input, SPGAN (Ours), and SoTA Deep Voxel Flow (ICCV’17) at step size = 1, 4, 7, 10; structural displacement runs from Small to Large across the columns]
Experimental Results: “Anime Frame Generation with Structure-consistent Prediction GANs”: https://youtu.be/vXVr64BbXHY
Video: https://youtu.be/X9j1fwexK2c?t=150
56. DeNA AI: Anime Inbetweening
Can inbetween frames with large structural movement
with good structural and time consistency
Average PSNR/SSIM on test dataset (step size=4)
Method            PSNR    SSIM
Deep Voxel Flow   23.32   0.9294
SPGAN (Ours)      24.27   0.9407
[Figure: PSNR and SSIM plots, Deep Voxel Flow vs. SPGAN (Ours)]
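For readers unfamiliar with the PSNR column, here is a minimal sketch of the usual definition, 10 * log10(MAX^2 / MSE), assuming 8-bit images (MAX = 255). It is not the evaluation code behind the reported numbers, and SSIM is left out because it needs a windowed computation that does not fit in a few lines.

```python
import numpy as np


def psnr(reference, generated, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    if ref.shape != gen.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    truth = np.zeros((64, 64, 3), dtype=np.uint8)
    noisy = np.full((64, 64, 3), 16, dtype=np.uint8)  # constant error of 16 levels
    print(round(psnr(truth, noisy), 2))
```

Higher is better for both metrics, so the table reads as a gain of about 1 dB PSNR for SPGAN at step size 4.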
58. DeNA AI: Anime generation with a few images
Anime generation with a few images
Successful image generation with detailed textures for each structural element
Can generate diverse body types, clothing, etc. from a rough designation
Structural Feature-embedding GANs (SFGAN)
[Figure: two examples, each with an Image (1 frame), a Rough designation (Structure designation), and the Generated result]
Generated Anime: https://youtu.be/X9j1fwexK2c?t=166
59. DeNA AI: Background art generation
Background art generation
Successful landscape generation with designated detailed texture for each part
Structural Feature-embedding GANs (SFGAN)
[Figure: Image (1 frame) and Layout; Generated Result from SFGAN (Ours) and from the SoTA model SPADE (CVPR’19)]
60. DeNA AI: Exact colorization
Exact colorization based on colorized example
Successful colorization that exactly reflects color example and line details
Structural Feature-embedding GANs (SFGAN)
[Figure labels: Colorized example (1 frame), Lines, Rough, Colorized result]
Generated Anime: https://youtu.be/X9j1fwexK2c?t=191
69. High-resolution and high-quality image generation with GANs
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
Applications for Anime Production
have been limited to some specific cases, such as
- Generation for specific body parts (e.g. Face)
- Colorization (which does not treat structural generation)
AI-generated images have become higher resolution and quality
and are harder to distinguish from real photos
However, full-body image generation with complex structures has been a challenge
72. Anime Generation: Progressive Structure-conditional GANs (PSGAN) (Hamada+, ECCVW 2018)
Image generation of full-body and high-resolution characters
which has been a challenge due to its complex structure
Generated anime characters (1024x1024)
https://youtu.be/bIi5gSITK0E
Generation of brand new characters
Add animation to the new character by imposing a pose sequence
https://youtu.be/0LQlfkvQ3Ok
Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks
Koichi Hamada, Kentaro Tachibana, Tianqi Li, Hiroto Honda, and Yusuke Uchida. In ECCV Workshop 2018.
73. Image generation of full-body and high-resolution characters
which has been a challenge due to its complex structure
Diverse characters and anime generation
Anime Generation: Progressive Structure-conditional GANs (PSGAN) (Hamada+, ECCVW 2018)
Generated anime characters (1024x1024)
https://youtu.be/bIi5gSITK0E
Generation of brand new characters
Add animation to the new character by imposing a pose sequence
https://youtu.be/0LQlfkvQ3Ok
Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks
Koichi Hamada, Kentaro Tachibana, Tianqi Li, Hiroto Honda, and Yusuke Uchida. In ECCV Workshop 2018.
74. Proposed method: Progressive Structure-conditional GANs (PSGAN)
Learn to generate structure and image simultaneously
Stabilize generative learning of complex structures
by progressive network growth
[Figure: PSGAN architecture grown from 4x4 to 1024x1024; labels: Latent, Real, Condition, Generated]
75. Proposed method: Progressive Structure-conditional GANs (PSGAN)
Structure and image generation at low resolution for high-level context
[Figure: PSGAN architecture at 4x4; labels: Latent, Real, Condition, Generated]
76. Proposed method: Progressive Structure-conditional GANs (PSGAN)
Structure and image generation at low resolution for high-level context
Increase resolution in a step-by-step manner
to progressively learn to generate the detailed structures
[Figure: PSGAN architecture grown from 4x4 to 8x8; labels: Latent, Real, Condition, Generated]
77. Proposed method: Progressive Structure-conditional GANs (PSGAN)
Structure and image generation at low resolution for high-level context
Increase resolution in a step-by-step manner
to progressively learn to generate the detailed structures
[Figure: PSGAN architecture grown from 4x4 through 512x512 to 1024x1024; labels: Latent, Real, Condition, Generated]
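Because the condition is shown at every resolution in the diagram, the pose condition has to exist at each scale from 1024x1024 down to 4x4. The sketch below builds such a pyramid from square keypoint maps by repeated 2x max pooling. Max pooling is an assumption chosen here because it keeps a single-pixel keypoint visible at low resolution; the slides do not say which downsampling is used, and all function names are illustrative.

```python
import numpy as np


def max_pool_2x(maps: np.ndarray) -> np.ndarray:
    """Halve the spatial size of (K, H, W) keypoint maps with 2x2 max pooling."""
    k, h, w = maps.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Split each axis into (blocks, 2) and take the max inside every 2x2 block.
    return maps.reshape(k, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def condition_pyramid(maps: np.ndarray, lowest: int = 4) -> dict:
    """Map each resolution (e.g. 1024, 512, ..., 4) to pose maps of that size.

    Assumes square maps whose side is `lowest` times a power of two.
    """
    pyramid = {maps.shape[1]: maps}
    current = maps
    while current.shape[1] > lowest:
        current = max_pool_2x(current)
        pyramid[current.shape[1]] = current
    return pyramid


if __name__ == "__main__":
    pose = np.zeros((2, 32, 32))  # 2 hypothetical keypoint channels at 32x32
    pose[0, 10, 20] = 1.0         # one keypoint
    pyr = condition_pyramid(pose)
    print(sorted(pyr), pyr[4][0].sum())
```

With average pooling instead, an isolated keypoint would fade toward zero at 4x4, which is the reason for the choice made in this sketch.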
80. Avatar Anime Dataset
Avatar Play
A novel dataset containing diverse character images and 2D poses
Built by Unity 3D Avatar models and motions
Developed utilizing 100 thousand 3D Avatar assets on the Mobage service
81. Generated Images: Progressive Structure-conditional GANs (PSGAN)
Image generation of full-body and high-resolution characters
which has been a challenge due to its complex structure
Diverse characters and anime generation
Generated anime characters (1024x1024)
https://youtu.be/bIi5gSITK0E
Generation of brand new characters
Add animation to the new character by imposing a pose sequence
https://youtu.be/0LQlfkvQ3Ok
82. Generated Images: Progressive Structure-conditional GANs (PSGAN)
Full-body anime generation at 1024x1024 with Progressive Structure-conditional GANs: https://youtu.be/bIi5gSITK0E
Generated results of brand new characters
Generated Anime
(1024x1024)
83. Generated Images: Progressive Structure-conditional GANs (PSGAN)
Adding action to full-body anime characters with Progressive Structure-conditional GANs: https://youtu.be/0LQlfkvQ3Ok
Add animation to the new character by imposing a pose sequence
Generated Anime
(1024x1024)
84. Generated Images: Progressive Structure-conditional GANs (PSGAN)
(ICLR’18)
Structure Consistency: PSGAN’s images are more structure-consistent
85. Generated Images: Progressive Structure-conditional GANs (PSGAN)
(ICLR’18)
(NIPS’17) (NIPS’17)
Structure Consistency: PSGAN’s images are more structure-consistent
Image Quality on Pose Conditions: more detailed and high-quality
86. Generated Images: Progressive Structure-conditional GANs (PSGAN)
Generated Images
Application for realistic images:
Generation of new clothes with indicated pose
92. Anime Inbetweening
Problems faced in Anime Production
• Limited time
• Limited budget
• Limited human resources
• Quality demands
Animator Web Report (in Japanese) (http://animatorweb.jp/)
98. Anime Inbetweening
• Animators have to draw 3,500-4,000 inbetweens per 30-minute anime episode
• This takes great effort: drawing one inbetween can take hours
[Figure*: anime inbetweening by animators; original 1, inbetweened frames, original 2]
*SSS STL WAO 2017 “Zunda Horizon”
Animator Web Report (in Japanese) (http://animatorweb.jp/)
107. State-of-the-art frame interpolation method
Frame Interpolation
Super SloMo (Jiang+, CVPR’18)
• Infer intermediate frames from input frame sequences at 30/60 FPS
  -> generate 240/480 FPS (x8) video
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation.
Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, Jan Kautz. In CVPR 2018.
Video: Research at NVIDIA: Transforming Standard Video Into Slow Motion with AI: https://youtu.be/MjViy6kyiqs
109. State-of-the-art frame interpolation method
Frame Interpolation
Deep Voxel Flow (Liu+, ICCV’17)
• Generate a 60 FPS video from a 30 FPS video
Video Frame Synthesis using Deep Voxel Flow.
Ziwei Liu, Raymond A. Yeh, Xiaoou Tang, Yiming Liu, Aseem Agarwala. In ICCV 2017.
Video: Video Frame Synthesis using Deep Voxel Flow: https://youtu.be/qNXPI01WlBU?t=30s
110. State-of-the-art frame interpolation method
Frame Interpolation
Calculate Optical Flow and interpolate frames using Neural Networks
Super SloMo (Jiang+, CVPR’18)
• Calculate Optical Flow -> synthesize intermediate frame -> refine
• Generate a 240/480 FPS video from a 30/60 FPS video
Deep Voxel Flow (Liu+, ICCV’17)
• Calculate Optical Flow -> synthesize intermediate frame
• Performance competitive with Super SloMo
[Figure labels: Super SloMo (Adobe), Super SloMo, Deep Voxel Flow]
Video Frame Synthesis using Deep Voxel Flow. Ziwei Liu, Raymond A. Yeh, Xiaoou Tang, Yiming Liu, Aseem Agarwala. In ICCV 2017.
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation. Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, Jan Kautz. In CVPR 2018.
111. Frame Interpolation
Optical Flow
Vector map representing displacement (movement) of points
between two consecutive frames
High intensity = Large displacement
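A minimal sketch of the "high intensity = large displacement" reading, assuming the flow is stored as an (H, W, 2) array of per-pixel (dx, dy) displacements. The array layout and function names are assumptions for illustration; estimating the flow itself is outside this sketch.

```python
import numpy as np


def flow_magnitude(flow) -> np.ndarray:
    """Per-pixel displacement length for an (H, W, 2) flow field of (dx, dy)."""
    flow = np.asarray(flow, dtype=np.float64)
    if flow.ndim != 3 or flow.shape[2] != 2:
        raise ValueError("flow must have shape (H, W, 2)")
    return np.hypot(flow[..., 0], flow[..., 1])


def to_intensity(magnitude: np.ndarray) -> np.ndarray:
    """Scale magnitudes to [0, 1] so the largest displacement is brightest."""
    peak = magnitude.max()
    if peak == 0:
        return np.zeros_like(magnitude)
    return magnitude / peak


if __name__ == "__main__":
    flow = np.zeros((4, 4, 2))
    flow[1, 2] = (3.0, 4.0)  # this point moved 3 px right and 4 px down
    mag = flow_magnitude(flow)
    print(mag[1, 2], to_intensity(mag)[1, 2])
```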
113. Frame Interpolation does not work for anime inbetweening
Issue : Anime Inbetweening
Frame interpolation for frames with large structural displacement is key.
• Existing methods:
  High-FPS input (30/60 → 240/480)
• Anime Inbetweening:
  Low-FPS input (3 7 → 12 30)
*SSS STL WAO 2017 “Zunda Horizon”
114. Frame Interpolation does not work for anime inbetweening
Issue : Anime Inbetweening
• Existing methods:
  Photo real
• Anime Inbetweening:
  Illustration
[Figure: Video and its Optical-flow]
Illustration style is monotone in color and low-textured -> difficult to calculate Optical Flow.
*SSS STL WAO 2017 “Zunda Horizon”
115. Frame Interpolation does not work for anime inbetweening
Issue : Anime Inbetweening
[Figure: methods placed by image style (Photo real vs. Illustration) and input frame rate (high-fps vs. low-fps); labels: Super SloMo, Deep Voxel Flow, …, Anime Inbetweening]
116. Proposed method:
Structure-consistent Prediction GANs
(SPGAN)
Structure-Aware Generative Learning
Challenges toward Anime Generation with Deep Generative Models
Koichi Hamada and Tianqi Li. In DeNA TechCon 2019.
117. Proposed method: Structure-consistent Prediction GANs (SPGAN)
Multi-task training using structural information and optical flow
• Input frames
• Optical flow
• Structure information:
- Pose keypoints
- Body-part masks
118. Proposed method: Structure-consistent Prediction GANs (SPGAN) - pipeline
Multi-task training using structural information and optical flow
Figure labels: input frames I0, I4; generator G; generated frames I1, I2, I3; discriminator D; generated optical flow and generated structure information, each compared with ground truth by MSE
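The pipeline combines a discriminator with two MSE terms, one on optical flow and one on structure information. A minimal sketch of how such a generator objective could be assembled is shown below. The non-saturating adversarial term and the weights `w_flow` and `w_struct` are assumptions for illustration; the slides do not state the exact loss form or weighting.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error between two arrays of equal shape."""
    return float(np.mean((pred - target) ** 2))

def generator_loss(d_fake_prob, flow_pred, flow_gt, struct_pred, struct_gt,
                   w_flow=1.0, w_struct=1.0):
    """Adversarial term plus the two MSE terms shown in the pipeline.

    d_fake_prob: discriminator probabilities that generated frames are real.
    w_flow, w_struct: hypothetical weights, not given in the slides.
    """
    eps = 1e-8
    # Non-saturating GAN loss (an assumption, chosen for simplicity).
    adversarial = float(-np.mean(np.log(d_fake_prob + eps)))
    return (adversarial
            + w_flow * mse(flow_pred, flow_gt)
            + w_struct * mse(struct_pred, struct_gt))

# Hypothetical toy values: flow is off by 1 everywhere, structure is exact.
loss = generator_loss(
    d_fake_prob=np.array([0.5, 0.5]),
    flow_pred=np.zeros((4, 4, 2)), flow_gt=np.ones((4, 4, 2)),
    struct_pred=np.ones((4, 4)), struct_gt=np.ones((4, 4)),
)
```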
120. Proposed method: Structure-consistent Prediction GANs (SPGAN) - Discriminator
Improve quality of details and time consistency
by Local Discriminator and Temporal Discriminator
Local Discriminator: Local Patch (16×16 pix) -> Conv-BN-ReLU ×4 -> “Real” or “Fake”
121. Proposed method: Structure-consistent Prediction GANs (SPGAN) - Discriminator
Improve quality of details and time consistency
by Local Discriminator and Temporal Discriminator
Local Discriminator: Local Patch (16×16 pix) -> Conv-BN-ReLU layers -> “Real” or “Fake”
Temporal Discriminator: Generated Image Sequence -> Conv-BN-ReLU layers -> FC -> “Real” or “Fake”
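The local discriminator judges 16×16-pixel patches rather than whole frames. A minimal sketch of preparing such patches is given below; it assumes non-overlapping tiling of an (H, W, C) frame, which is an illustrative choice since the slides do not say how patches are sampled. The function name `extract_patches` is hypothetical.

```python
import numpy as np

def extract_patches(image, size=16):
    """Split an (H, W, C) image into non-overlapping size x size patches.

    Returns an array of shape (num_patches, size, size, C). H and W are
    cropped to multiples of `size`.
    """
    h, w, c = image.shape
    rows, cols = h // size, w // size
    cropped = image[:rows * size, :cols * size]
    # Split both spatial axes into (block index, offset within block).
    patches = cropped.reshape(rows, size, cols, size, c).swapaxes(1, 2)
    return patches.reshape(rows * cols, size, size, c)

# Hypothetical 64x48 RGB frame -> 4 x 3 = 12 patches.
frame = np.arange(64 * 48 * 3, dtype=np.float32).reshape(64, 48, 3)
patches = extract_patches(frame)
```

Each patch would then be scored "Real" or "Fake" independently, which pushes the generator toward sharp local detail.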
123. Experiment settings
Extract five consecutive frames from a video and
infer the three intermediate frames using only the first and last frames
Video: image0 image1 image2 image3 image4
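The sampling scheme above can be sketched as follows: slide a five-frame window over the video, use the first and last frames as input, and keep the three middle frames as targets. The `step` argument is a hypothetical sampling stride added for illustration and is not defined in the slides.

```python
def make_samples(frames, step=1):
    """Build (inputs, targets) pairs from five frames spaced `step` apart.

    inputs  = (first, last) frame of each window
    targets = the three frames in between
    `step` is a hypothetical sampling stride, assumed for illustration.
    """
    samples = []
    span = 4 * step
    for start in range(len(frames) - span):
        window = frames[start:start + span + 1:step]
        samples.append(((window[0], window[4]), window[1:4]))
    return samples

# Frame names stand in for image arrays.
frames = [f"image{i}" for i in range(7)]
samples = make_samples(frames)
```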
128. Generated Results: Structure-consistent Prediction GANs (SPGAN)
The proposed method can interpolate frames with large displacement
while keeping good structural and temporal consistency
Examples: interpolation of frames with small movement; interpolation of frames with large movement
Panels: Input; SOTA model (Deep Voxel Flow); SPGAN (Ours)
Experimental Results: “Anime Frame Generation with Structure-consistent Prediction GANs” https://youtu.be/vXVr64BbXHY
Video: https://youtu.be/X9j1fwexK2c?t=139
129. Generated Results: Structure-consistent Prediction GANs (SPGAN)
The proposed method can interpolate frames with large displacement
while keeping good structural and temporal consistency
Columns: step size = 1, 4, 7, 10 (structural displacement: small -> large)
Rows: Input; SPGAN (Ours); SoTA Deep Voxel Flow (ICCV’17)
Experimental Results: “Anime Frame Generation with Structure-consistent Prediction GANs” https://youtu.be/vXVr64BbXHY
Video: https://youtu.be/X9j1fwexK2c?t=150
130. Quantitative Evaluations: Structure-consistent Prediction GANs (SPGAN)
Average PSNR/SSIM on test dataset (step size = 4)
Method            PSNR    SSIM
Deep Voxel Flow   23.32   0.9294
SPGAN (Ours)      24.27   0.9407
Panels: Deep Voxel Flow; SPGAN (Ours)
The proposed method can interpolate frames with large displacement
while keeping good structural and temporal consistency
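For reference, PSNR is defined as 10·log10(MAX² / MSE) between a ground-truth frame and a generated frame. The sketch below assumes 8-bit images (`max_value=255`), which the slides do not state; SSIM is omitted because it needs a windowed computation.

```python
import numpy as np

def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - generated.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Hypothetical frames that differ by 10 gray levels everywhere (MSE = 100).
reference = np.full((8, 8), 100, dtype=np.uint8)
generated = np.full((8, 8), 110, dtype=np.uint8)
score = psnr(reference, generated)
```

Higher is better, so the table's 24.27 dB versus 23.32 dB means SPGAN's frames are on average closer to the ground-truth frames.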
131. AGENDA - Anime Generation with AI -
Overview
Recent Progress on Image Generation and Issues
Our Approaches and Contributions
Anime Generation with PSGAN
Anime Inbetweening with SPGAN
Other AI Solutions for Anime Production Issues
Conclusion
132. Issues in anime production
Today we will discuss the following five points
Process: Issues
Overall process: Handling 4K/60fps (making large key animation the default, and increasing the number of inbetween frames); increased difficulty of managing processes & progress due to more complex processes (2D, 3D, etc.) and separation of animation processes
Create layout (LO): Insufficient key animation pipeline
Create key animation frames (first): Insufficient key animation pipeline
Create key animation frames (second): Insufficient key animation pipeline
Animation supervising (characters): Greater burden on animation supervising for characters due to more complex character design
Animation supervising (action): Insufficient key animation pipeline
Create inbetweens: Reduced pipeline & higher costs from overseas production companies
Finishing: Reduced pipeline & higher costs from overseas production companies
In-between check: Demands to shorten check time in response to lengthened lead time for animation (reduced post production time)
Background art: Background / art delivery timing tending to be delayed, with more shoots
3D: Lack of 3D animators and increased investment in education for training
133. DeNA AI: Overall process/ Animation (Inbetweens)
AI Generated Anime Inbetweening on Zunda Horizon* Test Data**
Panels: Input Frames; AI Generated Frames (x16 Generation)
x16 high-quality anime inbetweening
This makes creating 4K/60FPS animation easier
Our Generated Results: https://youtu.be/X9j1fwexK2c?t=4
*SSS STL WAO 2017 “Zunda Horizon”
**Trained using DeNA Dataset which does not include “Zunda Horizon” data
134. Character drawn in new pose by designating style and drawing 2D stick figure
Changing the 2D stick figure makes the character move in accordance with 3D structure
This makes creating key animation easier
Panels: Character designation; Structure designation (2D pose information); Generated examples
DeNA AI: Key Animation Frame/ Animation (Inbetweens)
Our Contribution: https://youtu.be/X9j1fwexK2c?t=104
135. Character drawn in new pose by designating style and drawing 2D stick figure
Changing the 2D stick figure makes the character move in accordance with 3D structure
This makes creating key animation easier
DeNA AI: Key Animation Frame/ Animation (Inbetweens)
Panels: New character generation; Designate 2D pose series to animate the new character
https://youtu.be/bIi5gSITK0E https://youtu.be/0LQlfkvQ3Ok
Our Contribution: https://youtu.be/X9j1fwexK2c?t=104
Video: https://youtu.be/Gz90H1M7_u4?t=50
136. Animation generation with a few images
Animation generated by designating roughs
This makes creating key animation & inbetweens easier
Panels (two examples): Image (1 frame); Rough designation (structure designation); Generated result
DeNA AI: Key Animation Frame/ Animation (Inbetweens)
Our Contribution: https://youtu.be/X9j1fwexK2c?t=166
137. Animation generation with a few images
Animation generated by designating roughs
This makes creating key animation & inbetweens easier
Panels (two examples): Image (1 frame); Rough designation (structure designation); Generated result
DeNA AI: Key Animation Frame/ Animation (Inbetweens)
Our Contribution: https://youtu.be/X9j1fwexK2c?t=166
Video: https://youtu.be/Gz90H1M7_u4?t=70
138. Automated coloring reflecting color sample and line details
This makes finishing easier
DeNA AI: Finishing (Colorization)
Panels: Colorized example (1 frame); Rough; Lines; Generated result
Our Contribution: https://youtu.be/X9j1fwexK2c?t=191
Video: https://youtu.be/Gz90H1M7_u4?t=80
139. Designate layout from art designation image and generate background art
with detailed textures from each structural element
Background automatically generated by drawing layout
Can allocate more time to drawing to raise background quality
DeNA AI: Background Art
Panels: Art designation image (1 frame); Layout; Generated result
Our Contribution: https://youtu.be/X9j1fwexK2c?t=183
140. Designate layout from art designation image and generate background art
with detailed textures from each structural element
Background automatically generated by drawing layout
Can allocate more time to drawing to raise background quality
DeNA AI: Background Art
Panels: Image (1 frame); Layout; Generated result from SFGAN (Ours); Generated result from SoTA model SPADE (CVPR’19)
Our Contribution: https://youtu.be/X9j1fwexK2c?t=183
141. AGENDA - Anime Generation with AI -
Overview
Recent Progress on Image Generation and Issues
Our Approaches and Contributions
Anime Generation with PSGAN
Anime Inbetweening with SPGAN
Other AI Solutions for Anime Production Issues
Conclusion
142. AI Generated Anime Inbetweening on Zunda Horizon* Test Data**
AI Anime Inbetweening for a wide variety of objects
Panels: Input Frames; AI Generated Frames (x16 Generation)
Generated Anime: https://youtu.be/X9j1fwexK2c?t=4
*SSS STL WAO 2017 “Zunda Horizon”
**Trained using DeNA Dataset which does not include “Zunda Horizon” data
143. AI Generated Anime Inbetweening on Zunda Horizon* Test Data**
Comparison with human animators
Panels: Input Frames; AI Generated Frames (x4 Generation); Actual Frames drawn by Human Animators (x4 Generation)
Generated Anime: https://youtu.be/X9j1fwexK2c?t=80
*SSS STL WAO 2017 “Zunda Horizon”
**Trained using DeNA Dataset which does not include “Zunda Horizon” data
144. High-resolution and high-quality image generation with GANs
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
AI-generated images have become higher in resolution and quality
and are harder to distinguish from real photos
However, full-body image generation with complex structures has remained a challenge
145. High-resolution and high-quality image generation with GANs
ProgressiveGAN (Karras et al., ICLR 2018) BigGAN (Brock et al., ICLR 2019)
Applications for Anime Production
Limited to some specific cases, such as
- Generation for specific body parts (e.g. Face)
- Colorization (which does not treat structural generation)
AI-generated images have become higher in resolution and quality
and are harder to distinguish from real photos
However, full-body image generation with complex structures has remained a challenge
146. Anime Generation with DeNA AI
Progressive Structure-conditional GANs (PSGAN)
Figure labels: Latent; Condition; Generated; Real; resolution levels 4x4, 512x512, 1024x1024
Generation of brand new characters
Add animation to the new character by imposing a pose sequence
Successful image generation of full-body and high-resolution characters
Diverse characters and anime generation
Full-body High-resolution Anime Generation with Progressive Structure-conditional Generative Adversarial Networks.
Koichi Hamada, Kentaro Tachibana, Tianqi Li, Hiroto Honda, and Yusuke Uchida. In ECCV Workshop 2018.
Generated Anime: https://youtu.be/X9j1fwexK2c?t=104
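The resolution labels above (4x4 up to 1024x1024) suggest a progressive schedule in which a structure condition is supplied at every scale. The sketch below illustrates that idea under stated assumptions: resolutions double at each stage, as in progressive GAN training, and a pose map is reduced to each resolution by max pooling. Both the pooling choice and the function names are assumptions for illustration, not confirmed by the slides.

```python
import numpy as np

def resolution_schedule(start=4, final=1024):
    """Resolutions visited when doubling from `start` up to `final`."""
    sizes = [start]
    while sizes[-1] < final:
        sizes.append(sizes[-1] * 2)
    return sizes

def downsample_condition(pose_map, size):
    """Max-pool a square (S, S) pose map down to (size, size).

    Max pooling is an assumption here; it keeps sparse keypoints visible
    at low resolution.
    """
    factor = pose_map.shape[0] // size
    return pose_map.reshape(size, factor, size, factor).max(axis=(1, 3))

sizes = resolution_schedule()
pose_map = np.zeros((1024, 1024), dtype=np.float32)
pose_map[300, 700] = 1.0  # hypothetical single keypoint
# One condition map per training resolution.
conditions = {s: downsample_condition(pose_map, s) for s in sizes}
```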
147. Successful video interpolation between frames with large structural movement
Anime inbetweening
Anime Generation with DeNA AI
Structure-consistent Prediction GANs (SPGAN)
Panels: Input; SPGAN (Ours); SOTA model Deep Voxel Flow (ICCV’17)
Plots: PSNR and SSIM against structural displacement, SPGAN (Ours) vs. Deep Voxel Flow (ICCV’17)
Challenges toward Anime Generation with Deep Generative Models.
Koichi Hamada and Tianqi Li. In DeNA TechCon 2019.
Generated Anime: https://youtu.be/X9j1fwexK2c?t=132
148. Successful image generation with detailed textures for each structural element
Anime generation with a few images
Anime Generation with DeNA AI
Structural Feature-embedding GANs (SFGAN)
Panels (two examples): Image (1 frame); Rough designation (Structure designation); Generated result
Generated Anime: https://youtu.be/X9j1fwexK2c?t=166
149. Successful landscape generation with designated detailed texture for each part
Background art generation
Anime Generation with DeNA AI
Structural Feature-embedding GANs (SFGAN)
Panels: Image (1 frame); Layout; Generated result from SFGAN (Ours); Generated result from SoTA model SPADE (CVPR’19)
150. Successful colorization that exactly reflects color example and line details
Exact colorization based on colorized example
Anime Generation with DeNA AI
Structural Feature-embedding GANs (SFGAN)
Panels: Colorized example (1 frame); Rough; Lines; Colorized result
Generated Anime: https://youtu.be/X9j1fwexK2c?t=191
151. Anime Generation with DeNA AI
Background: In 2018, AI generates high-quality images that are hard to distinguish from real photos
Challenge:
• Anime Generation: High-quality image generation with complex structures
• Anime Inbetweening: Interpolation between frames with large structural movement
Our Solution: Structure-Aware Generative Learning
• Progressive Structure-conditional GANs (PSGAN)
• Structure-consistent Prediction GANs (SPGAN)
Successful generation of diverse characters and animations
152. There are great possibilities for AI-generated animation
At DeNA, we challenge ourselves to provide
new value in anime generation
We would be happy if we could work together with you
to create a better future for animation production
Please contact us: ai@dena.com
Koichi Hamada (@hamadakoichi)
Anime Generation with DeNA AI