Toru Ogawa

Building a Manga Dataset "Manga109" with Annotations for Multimedia Applications

May 12, 2020
Kiyoharu Aizawa, Azuma Fujimoto, Atsushi Otsubo, Toru Ogawa, Yusuke Matsui, Koki Tsubota, Hikaru Ikuta

Manga, or comics, which are a type of multimodal artwork, have been left behind in the recent trend of deep learning applications because of the lack of a proper dataset. Hence, we built Manga109, a dataset consisting of a variety of 109 Japanese comic books (94 authors and 21,142 pages) and made it publicly available by obtaining author permissions for academic use. We carefully annotated the frames, speech texts, character faces, and character bodies; the total number of annotations exceeds 500k. This dataset provides numerous manga images and annotations, which will be beneficial for use in machine learning algorithms and their evaluation. In addition to academic use, we obtained further permission for a subset of the dataset for industrial use. In this article, we describe the details of the dataset and present a few examples of multimedia processing applications (detection, retrieval, and generation) that apply existing deep learning methods and are made possible by the dataset.

* IEEE MultiMedia 2020
* 10 pages, 8 figures
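
The four annotation types named in the abstract (frames, speech texts, character faces, character bodies) can be tallied with a short script. The sketch below assumes a simplified XML layout in which each `page` element holds `frame`, `text`, `face`, and `body` children with bounding-box attributes. The element names, attribute names, and sample data are illustrative assumptions, not a specification of the dataset's files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, hand-written sample; not taken from the dataset.
SAMPLE = """
<book title="ExampleTitle">
  <pages>
    <page index="0" width="1654" height="1170">
      <frame xmin="10" ymin="10" xmax="800" ymax="500"/>
      <body xmin="100" ymin="50" xmax="400" ymax="480"/>
      <face xmin="180" ymin="60" xmax="300" ymax="170"/>
      <text xmin="500" ymin="40" xmax="600" ymax="200">Hello</text>
    </page>
    <page index="1" width="1654" height="1170">
      <frame xmin="10" ymin="10" xmax="800" ymax="1100"/>
      <frame xmin="820" ymin="10" xmax="1640" ymax="1100"/>
    </page>
  </pages>
</book>
"""

# Assumed tag names for the four annotation types listed in the abstract.
ANNOTATION_TYPES = ("frame", "text", "face", "body")


def count_annotations(xml_string):
    """Count annotation elements of each assumed type across all pages."""
    root = ET.fromstring(xml_string)
    counts = Counter()
    for page in root.iter("page"):
        for child in page:
            if child.tag in ANNOTATION_TYPES:
                counts[child.tag] += 1
    return counts


counts = count_annotations(SAMPLE)
print(dict(counts))
```

Summing `counts.values()` over every book would give a total comparable to the "exceeds 500k" figure, provided the real files follow a layout like the one assumed here.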

Team PFDet's Methods for Open Images Challenge 2019

Oct 25, 2019
Yusuke Niitani, Toru Ogawa, Shuji Suzuki, Takuya Akiba, Tommi Kerola, Kohei Ozaki, Shotaro Sano

We present the instance segmentation and object detection methods used by team PFDet for the Open Images Challenge 2019. We tackle a massive dataset size, a huge class imbalance, and federated annotations. Using these methods, team PFDet achieved 3rd and 4th place in the instance segmentation and object detection tracks, respectively.

Chainer: A Deep Learning Framework for Accelerating the Research Cycle

Aug 01, 2019
Seiya Tokui, Ryosuke Okuta, Takuya Akiba, Yusuke Niitani, Toru Ogawa, Shunta Saito, Shuji Suzuki, Kota Uenishi, Brian Vogel, Hiroyuki Yamazaki Vincent

Software frameworks for neural networks play a key role in the development and application of deep learning methods. In this paper, we introduce the Chainer framework, which intends to provide a flexible, intuitive, and high performance means of implementing the full range of deep learning models needed by researchers and practitioners. Chainer provides acceleration using Graphics Processing Units with a familiar NumPy-like API through CuPy, supports general and dynamic models in Python through Define-by-Run, and also provides add-on packages for state-of-the-art computer vision models as well as distributed training.

* Accepted for Applied Data Science Track in KDD'19
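
Define-by-Run means the computational graph is recorded while ordinary Python code executes, so control flow such as loops can change the graph from one call to the next. The toy below illustrates that idea with scalar values in plain Python. It is not Chainer code and does not use Chainer's API; the `Var` class and `model` function are hypothetical names introduced only for this sketch.

```python
class Var:
    """Scalar value that records how it was computed (toy illustration)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent Var, local derivative of self w.r.t. parent).
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self):
        # Post-order walk of the recorded graph, then reverse it so every
        # node is processed before the nodes it was computed from.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


def model(x, w, steps):
    # The loop count is ordinary Python data, so the graph depth is
    # decided at run time rather than declared in advance.
    y = x
    for _ in range(steps):
        y = y * w
    return y + x


x, w = Var(2.0), Var(3.0)
out = model(x, w, steps=2)   # computes x * w * w + x
out.backward()
print(out.value, x.grad, w.grad)
```

Calling `model` with a different `steps` value builds a different graph without any change to the model definition, which is the flexibility the abstract attributes to Define-by-Run.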

Dynamic Manipulation of Flexible Objects with Torque Sequence Using a Deep Neural Network

Jan 29, 2019
Kento Kawaharazuka, Toru Ogawa, Juntaro Tamura, Cota Nabeshima

For dynamic manipulation of flexible objects, we propose a method that acquires a motion equation model of a flexible object using a deep neural network, together with a control method that realizes a target state by calculating an optimized time-series joint torque command. With the proposed method, no physics model of the target object is needed, and the object can be controlled as intended. We applied this method to manipulations of a rigid object, a flexible object with and without environmental contact, and a cloth, and verified its effectiveness.
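
The control step described above, optimizing a whole torque sequence so that a forward model reaches a target state, can be sketched in a few lines. This is a toy under stated assumptions: the forward model here is a hand-written one-dimensional double integrator standing in for the learned network, and gradients come from finite differences rather than from whatever the authors use. All function names and parameter values are hypothetical.

```python
def rollout(torques, dt=0.1):
    """Stand-in forward model: torque changes velocity, velocity changes position."""
    pos, vel = 0.0, 0.0
    for u in torques:
        vel += dt * u
        pos += dt * vel
    return pos


def loss(torques, target):
    """Squared distance between the final position and the target."""
    return (rollout(torques) - target) ** 2


def optimize_torques(target, horizon=5, lr=50.0, iters=100, eps=1e-4):
    """Gradient descent on the full torque sequence with central differences."""
    torques = [0.0] * horizon
    for _ in range(iters):
        grads = []
        for t in range(horizon):
            up = list(torques)
            up[t] += eps
            down = list(torques)
            down[t] -= eps
            grads.append((loss(up, target) - loss(down, target)) / (2 * eps))
        torques = [u - lr * g for u, g in zip(torques, grads)]
    return torques


target = 0.5
torques = optimize_torques(target)
final_pos = rollout(torques)
print(torques, final_pos)
```

Because the optimizer only calls `rollout`, the stand-in could be swapped for any model that maps a torque sequence to a predicted state.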

Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects

Nov 27, 2018
Yusuke Niitani, Takuya Akiba, Tommi Kerola, Toru Ogawa, Shotaro Sano, Shuji Suzuki

Efficient and reliable methods for training of object detectors are in higher demand than ever, and more and more data relevant to the field is becoming available. However, large datasets like Open Images Dataset v4 (OID) are sparsely annotated, and some measure must be taken in order to ensure the training of a reliable detector. In order to take the incompleteness of these datasets into account, one possibility is to use pretrained models to detect the presence of the unverified objects. However, the performance of such a strategy depends largely on the power of the pretrained model. In this study, we propose part-aware sampling, a method that uses human intuition for the hierarchical relation between objects. In terse terms, our method works by making assumptions like "a bounding box for a car should contain a bounding box for a tire". We demonstrate the power of our method on OID and compare the performance against a method based on a pretrained model. Our method also won the first and second place on the public and private test sets of the Google AI Open Images Competition 2018.
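
The car-and-tire assumption can be turned into a simple rule: if a proposal lies inside an annotated subject box, a missing annotation for the part class is not treated as evidence that the part is absent. The sketch below is one possible reading of that rule based only on the abstract. The `PART_OF` table, the strict containment test, and the function names are hypothetical, and the paper's actual procedure may differ.

```python
# Hypothetical part -> subject relation; a real table would cover many classes.
PART_OF = {"tire": "car"}


def contains(outer, inner):
    """True if box `inner` lies fully inside box `outer` (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def ignored_classes(proposal, annotations):
    """Part classes whose classification loss is skipped for this proposal.

    `annotations` is a list of (label, box) pairs for one image.
    """
    ignored = set()
    for part, subject in PART_OF.items():
        for label, box in annotations:
            if label == subject and contains(box, proposal):
                ignored.add(part)
    return ignored


annotations = [("car", (0, 0, 100, 60))]
inside = ignored_classes((10, 40, 30, 60), annotations)
outside = ignored_classes((150, 40, 170, 60), annotations)
print(inside, outside)
```

A proposal inside the car box is exempt from the "tire" loss, so an unannotated tire is not penalized as background, while a proposal elsewhere in the image is trained as usual.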

PFDet: 2nd Place Solution to Open Images Challenge 2018 Object Detection Track

Sep 04, 2018
Takuya Akiba, Tommi Kerola, Yusuke Niitani, Toru Ogawa, Shotaro Sano, Shuji Suzuki

We present a large-scale object detection system by team PFDet. Our system enables training with huge datasets using 512 GPUs and handles sparsely verified classes and massive class imbalance. Using our method, we achieved 2nd place in the Google AI Open Images Object Detection Track 2018 on Kaggle.

* Technical report for Open Images Challenge 2018 Object Detection Track

Object Detection for Comics using Manga109 Annotations

Mar 26, 2018
Toru Ogawa, Atsushi Otsubo, Rei Narita, Yusuke Matsui, Toshihiko Yamasaki, Kiyoharu Aizawa

With the growth of digitized comics, image understanding techniques are becoming important. In this paper, we focus on object detection, which is a fundamental task of image understanding. Although convolutional neural network (CNN)-based methods have achieved good performance in object detection for naturalistic images, there are two problems in applying these methods to the comic object detection task. First, there is no large-scale annotated comics dataset, and CNN-based methods require large-scale annotations for training. Second, the objects in comics overlap heavily compared to those in naturalistic images. This overlap causes the assignment problem in existing CNN-based methods. To solve these problems, we proposed a new annotation dataset and a new CNN model. We annotated an existing image dataset of comics and created the largest annotation dataset, named Manga109-annotations. For the assignment problem, we proposed a new CNN-based detector, SSD300-fork. We compared SSD300-fork with other detection methods using Manga109-annotations and confirmed that our model outperformed them based on the mAP score.

* http://www.manga109.org/en/
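
Overlap between boxes is commonly measured with intersection over union (IoU), which detectors of this family typically use to match candidate boxes to ground truth. The sketch below computes IoU for a made-up body box, a face box inside it, and one candidate box that overlaps both, which is the kind of ambiguity the abstract calls the assignment problem. The coordinates are invented for illustration, and the code does not reproduce SSD300-fork.

```python
def area(box):
    """Area of a box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


# Invented coordinates: a character body with the face box inside it.
body = (0.0, 0.0, 100.0, 200.0)
face = (25.0, 0.0, 75.0, 50.0)
candidate = (20.0, 0.0, 80.0, 60.0)

print(iou(body, face))
print(iou(candidate, face), iou(candidate, body))
```

The candidate box has nonzero IoU with both the face and the body, so a matching rule must decide which ground-truth box, and therefore which class, it is trained against.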

ChainerCV: a Library for Deep Learning in Computer Vision

Aug 28, 2017
Yusuke Niitani, Toru Ogawa, Shunta Saito, Masaki Saito

Despite significant progress of deep learning in the field of computer vision, there has not been a software library that covers these methods in a unifying manner. We introduce ChainerCV, a software library that is intended to fill this gap. ChainerCV supports numerous neural network models as well as software components needed to conduct research in computer vision. These implementations emphasize simplicity, flexibility and good software engineering practices. The library is designed to perform on par with the results reported in published papers and its tools can be used as a baseline for future research in computer vision. Our implementation includes sophisticated models like Faster R-CNN and SSD, and covers tasks such as object detection and semantic segmentation.

* Accepted to ACM MM 2017 Open Source Software Competition
