Models, code, and papers for "Sameh K":
Knowledge graphs have attracted considerable attention in academic and industrial environments. Despite their usefulness, popular knowledge graphs suffer from incomplete information, especially in their type assertions. This has encouraged research into the automatic discovery of entity types. In this context, multiple works have used logical inference on ontologies and statistical machine learning methods to learn type assertions in knowledge graphs. However, these approaches suffer from limited performance on noisy data, limited scalability, and dependence on labeled training samples. In this work, we propose a new unsupervised approach that learns to categorize entities into a hierarchy of named groups. We show that our approach is able to effectively learn entity groups using a scalable procedure in noisy and sparse datasets. We evaluate our approach on a set of popular knowledge graph benchmarking datasets, and we publish a collection of the resulting group hierarchies.
Breast cancer is the most common cancer and is the leading cause of cancer death among women worldwide. Detection of breast cancer, while it is still small and confined to the breast, provides the best chance of effective treatment. Computer Aided Detection (CAD) systems that detect cancer from mammograms will help in reducing the human errors that lead to missing breast carcinoma. The literature is rich in scientific papers on methods of CAD design, yet offers no complete system architecture to deploy those methods. On the other hand, commercial CADs are developed and deployed only to vendors' mammography machines and are not available for public access. This paper presents a complete CAD; it is complete since it combines, on the one hand, the rigor of algorithm design and assessment (method), and, on the other hand, the implementation and deployment of a system architecture for public accessibility (system). (1) We develop a novel algorithm for image enhancement so that mammograms acquired from any digital mammography machine look qualitatively of the same clarity to radiologists' inspection, and are quantitatively standardized for the detection algorithms. (2) We develop novel algorithms for masses and microcalcifications detection with accuracy superior to both literature results and the majority of approved commercial systems. (3) We design, implement, and deploy a system architecture that is computationally effective to allow for deploying these algorithms to the cloud for public access.
This paper proposes a scheme to efficiently execute distributed learning tasks in an asynchronous manner while minimizing the gradient staleness on wireless edge nodes with heterogeneous computing and communication capacities. The designed approach considered in this paper ensures that all devices work for a certain duration that covers the time for data/model distribution, learning iterations, model collection and global aggregation. The resulting problem is an integer non-convex program with quadratic equality constraints as well as linear equality and inequality constraints. Because the problem is NP-hard, we relax the integer constraints in order to solve it efficiently with available solvers. Analytical bounds are derived using the KKT conditions and Lagrangian analysis in conjunction with the suggest-and-improve approach. Results show that our approach reduces the gradient staleness and can offer better accuracy than the synchronous scheme and the asynchronous scheme with equal task allocation.
This paper aims to establish a new optimization paradigm for implementing realistic distributed learning algorithms, with performance guarantees, on wireless edge nodes with heterogeneous computing and communication capacities. We will refer to this new paradigm as "Mobile Edge Learning (MEL)". The problem of dynamic task allocation for MEL is considered in this paper with the aim of maximizing the learning accuracy, while guaranteeing that the total times of data distribution/aggregation over heterogeneous channels, and local computing iterations at the heterogeneous nodes, are bounded by a preset duration. The problem is first formulated as a quadratically-constrained integer linear problem. Because this problem is NP-hard, the paper relaxes it into a non-convex problem over real variables. We thus propose two solutions: the first derives analytical upper bounds on the optimal solution of this relaxed problem using Lagrangian analysis and KKT conditions, and the second uses suggest-and-improve starting from equal batch allocation. The merits of these proposed solutions are exhibited by comparing their performance to both numerical approaches and the equal task allocation approach.
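The time-bounded allocation idea in the two abstracts above can be illustrated with a small sketch. It is not the papers' method (no relaxation, KKT bounds, or suggest-and-improve); it assumes a hypothetical linear time model in which node k needs c2*tau*d + c1*d + c0 milliseconds to receive d samples, run tau local iterations, and return its model. All names and coefficients are illustrative assumptions.

```python
# Illustrative sketch only: a greedy search under an ASSUMED linear time model.
# Each node is a tuple (c2, c1, c0) in integer milliseconds:
#   time(d, tau) = c2 * tau * d + c1 * d + c0

def max_batch(node, tau, deadline):
    """Largest batch a node can handle within the deadline for tau iterations."""
    c2, c1, c0 = node
    if deadline <= c0:
        return 0
    return (deadline - c0) // (c2 * tau + c1)

def best_tau(nodes, total_samples, deadline, tau_cap=10_000):
    """Largest tau for which the nodes can jointly cover all samples in time."""
    best = 0
    for tau in range(1, tau_cap + 1):
        # Capacity shrinks as tau grows, so the first failure ends the search.
        if sum(max_batch(n, tau, deadline) for n in nodes) < total_samples:
            break
        best = tau
    return best

def allocate(nodes, total_samples, deadline, tau):
    """Fill nodes up to their capacity, in order, until all samples are placed."""
    remaining = total_samples
    batches = []
    for node in nodes:
        take = min(max_batch(node, tau, deadline), remaining)
        batches.append(take)
        remaining -= take
    return batches

def equal_tau(nodes, total_samples, deadline):
    """Iterations achievable when every node gets the same share of samples."""
    share = -(-total_samples // len(nodes))  # ceiling division
    return max(0, min((deadline - c0 - c1 * share) // (c2 * share)
                      for c2, c1, c0 in nodes))

# Hypothetical example: one fast node and one slow node, 10 s deadline.
nodes = [(1, 2, 500), (4, 2, 500)]
tau = best_tau(nodes, total_samples=1000, deadline=10_000)
print(tau, allocate(nodes, 1000, 10_000, tau), equal_tau(nodes, 1000, 10_000))
```

Under these assumed numbers, capacity-aware allocation fits more local iterations into the same deadline than an equal split, which is the qualitative effect both abstracts describe.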
This paper presents novel approaches for efficient feature extraction using environmental sound magnitude spectrograms. We propose an approach based on the visual domain. This approach includes three methods. In the first method, each spectrogram is filtered with a single log-Gabor filter, followed by a mutual information procedure. In the second method, the spectrogram goes through the same steps as in the first method but with an averaged bank of 12 log-Gabor filters. The third method segments the spectrogram into three patches and then applies the second method to each patch. The classification results show that the second method is the most efficient in our environmental sound classification system.
We study the maximum mean discrepancy (MMD) in the context of critical transitions modelled by fast-slow stochastic dynamical systems. We establish a new link between the dynamical theory of critical transitions and the statistical aspects of the MMD. In particular, we show that a formal approximation of the MMD near fast subsystem bifurcation points can be computed to leading order. This leading-order approximation shows that the MMD depends intricately on the fast-slow system's parameters and that one can only expect to extract warning signs under rather stringent conditions. However, the MMD turns out to be an excellent binary classifier to detect the change point induced by the critical transition. We cross-validate our results by numerical simulations for a van der Pol-type model.
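For readers unfamiliar with the MMD, the following minimal sketch computes a biased (V-statistic) estimate of the squared MMD between two scalar samples. The Gaussian kernel, the bandwidth, and the toy samples are assumptions made for illustration; the abstract does not state which kernel or estimator the paper uses.

```python
import math

def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars (an assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        return sum(rbf(u, v, bandwidth) for u in a for v in b) / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy windows of a time series before and after a jump (hypothetical values).
before = [0.0, 0.1, -0.1]
after = [3.0, 3.1, 2.9]
print(mmd_squared(before, before))  # identical samples
print(mmd_squared(before, after))   # well-separated samples
```

A value near zero for similar windows and a large value for windows on either side of a jump is what makes the statistic usable as a change-point detector, as the abstract notes.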
One major bottleneck in the practical implementation of received signal strength (RSS) based indoor localization systems is the extensive deployment effort required to construct the radio maps through fingerprinting. In this paper, we aim to design an indoor localization scheme that can be directly employed without building a full fingerprinted radio map of the indoor environment. By accumulating the information of localized RSSs, this scheme can also simultaneously construct the radio map with limited calibration. To design this scheme, we employ a source data set that possesses the same spatial correlation of the RSSs in the indoor environment under study. The knowledge of this data set is then transferred to a limited number of calibration fingerprints and one or several RSS observations with unknown locations, in order to perform direct localization of these observations using manifold alignment. We test two different source data sets, namely a simulated radio propagation map and the environment's plan coordinates. For moving users, we exploit the correlation of their observations to improve the localization accuracy. The online testing in two indoor environments shows that the plan coordinates achieve better results than the simulated radio maps, with negligible degradation at a 70-85% reduction in calibration load.
In the era of the Internet of Things (IoT), an enormous number of sensing devices collect and/or generate various sensory data over time for a wide range of fields and applications. Based on the nature of the application, these devices will result in big or fast/real-time data streams. Applying analytics over such data streams to discover new information, predict future insights, and make control decisions is a crucial process that makes IoT a worthy paradigm for businesses and a quality-of-life improving technology. In this paper, we provide a thorough overview of using a class of advanced machine learning techniques, namely Deep Learning (DL), to facilitate the analytics and learning in the IoT domain. We start by articulating IoT data characteristics and identifying two major treatments for IoT data from a machine learning perspective, namely IoT big data analytics and IoT streaming data analytics. We also discuss why DL is a promising approach to achieve the desired analytics in these types of data and applications. The potential of using emerging DL techniques for IoT data analytics is then discussed, and its promises and challenges are introduced. We present a comprehensive background on different DL architectures and algorithms. We also analyze and summarize major reported research attempts that leveraged DL in the IoT domain. The smart IoT devices that have incorporated DL in their intelligence background are also discussed. DL implementation approaches on the fog and cloud centers in support of IoT applications are also surveyed. Finally, we shed light on some challenges and potential directions for future research. At the end of each section, we highlight the lessons learned based on our experiments and review of the recent literature.
In this paper, we design a multimodal framework for object detection, recognition and mapping based on the fusion of stereo camera frames, point cloud Velodyne Lidar scans, and Vehicle-to-Vehicle (V2V) Basic Safety Messages (BSMs) exchanged using Dedicated Short Range Communication (DSRC). We merge the key features of rich texture descriptions of objects from 2D images, depth and distance between objects provided by 3D point clouds, and awareness of hidden vehicles from BSMs' 3D information. We present joint pixel-to-point-cloud and pixel-to-V2V correspondences of objects in frames from the KITTI Vision Benchmark Suite, using a semi-supervised manifold alignment approach to achieve camera-Lidar and camera-V2V mapping of recognized objects that share the same underlying manifold.
Sparse representations have been successfully applied to signal processing, computer vision and machine learning. Currently there is a trend to learn sparse models directly on structured data, such as region covariance. However, such methods, when combined with region covariance, often require complex computation. We present an approach to transform a structured sparse model learning problem into a traditional vectorized sparse modeling problem by constructing a Euclidean space representation for region covariance matrices. Our new representation has multiple advantages. Experiments on several vision tasks demonstrate competitive performance with the state-of-the-art methods.
This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVidia Titan X, producing high-quality, edge-preserved, quantization-free disparity maps. A key insight of this paper is that the network achieves a sub-pixel matching precision that is an order of magnitude higher than those of traditional stereo matching approaches. This allows us to achieve real-time performance by using a very low resolution cost volume that encodes all the information needed to achieve high disparity precision. Spatial precision is achieved by employing a learned edge-aware upsampling function. Our model uses a Siamese network to extract features from the left and right image. A first estimate of the disparity is computed in a very low resolution cost volume, then hierarchically the model re-introduces high-frequency details through a learned upsampling function that uses compact pixel-to-pixel refinement networks. Leveraging color input as a guide, this function is capable of producing high-quality edge-aware output. We achieve compelling results on multiple benchmarks, showing how the proposed method offers extreme flexibility at an acceptable computational budget.
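To make the notion of sub-pixel precision from a coarse cost volume concrete, here is a minimal sketch of soft-argmin, one common way to read a fractional disparity from per-disparity matching costs. The abstract does not say which mechanism StereoNet uses, so the choice of soft-argmin, the function name, and the cost values are all assumptions for illustration.

```python
import math

def soft_argmin(costs):
    """Expected disparity under a softmax over negated matching costs.

    costs[d] is the matching cost of integer disparity d for one pixel.
    Returns a fractional (sub-pixel) disparity.
    """
    lowest = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total

# Hypothetical costs for one pixel over disparities 0..4.
symmetric = [4.0, 1.0, 0.0, 1.0, 4.0]   # minimum exactly at d = 2
skewed = [4.0, 1.0, 0.0, 0.5, 4.0]      # evidence leans toward d = 3

print(soft_argmin(symmetric))
print(soft_argmin(skewed))

# A disparity estimated at reduced resolution is rescaled when upsampled;
# the factor 8 here is an assumed downsampling ratio.
print(8 * soft_argmin(skewed))
```

Because the estimate is a weighted average and not a hard minimum, it can land between integer disparities, which is why a fractional error at low resolution can still translate into a small error after upsampling.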
We propose a novel efficient and lightweight model for human pose estimation from a single image. Our model is designed to achieve competitive results at a fraction of the number of parameters and computational cost of various state-of-the-art methods. To this end, we explicitly incorporate part-based structural and geometric priors in a hierarchical prediction framework. At the coarsest resolution, and in a manner similar to classical part-based approaches, we leverage the kinematic structure of the human body to propagate convolutional feature updates between the keypoints or body parts. Unlike classical approaches, we adopt end-to-end training to learn this geometric prior through feature updates from data. We then propagate the feature representation at the coarsest resolution up the hierarchy to refine the predicted pose in a coarse-to-fine fashion. The final network effectively models the geometric prior and intuition within a lightweight deep neural network, yielding state-of-the-art results for a model of this size on two standard datasets, Leeds Sports Pose and MPII Human Pose.
We explore total scene capture -- recording, modeling, and rerendering a scene under varying appearance such as season and time of day. Starting from internet photos of a tourist landmark, we apply traditional 3D reconstruction to register the photos and approximate the scene as a point cloud. For each photo, we render the scene points into a deep framebuffer, and train a neural network to learn the mapping of these initial renderings to the actual photos. This rerendering network also takes as input a latent appearance vector and a semantic mask indicating the location of transient objects like pedestrians. The model is evaluated on several datasets of publicly available images spanning a broad range of illumination conditions. We create short videos demonstrating realistic manipulation of the image viewpoint, appearance, and semantic labeling. We also compare results with prior work on scene reconstruction from internet photos.
In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems. Due to the lack of ground truth, our method is fully self-supervised, yet it produces precise depth with a subpixel precision of 1/30th of a pixel; it does not suffer from the common over-smoothing issues; it preserves the edges; and it explicitly handles occlusions. We introduce a novel reconstruction loss that is more robust to noise and texture-less patches, and is invariant to illumination changes. The proposed loss is optimized using a window-based cost aggregation with an adaptive support weight scheme. This cost aggregation is edge-preserving and smooths the loss function, which is key to allowing the network to reach compelling results. Finally we show how the task of predicting invalid regions, such as occlusions, can be trained end-to-end without ground truth. This component is crucial to reduce blur and particularly improves predictions along depth discontinuities. Extensive quantitative and qualitative evaluations on real and synthetic data demonstrate state-of-the-art results in many challenging scenes.
Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus on real-time performance capture of humans under motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take the novel approach of augmenting such real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint, and jointly performs completion, super resolution, and denoising of the imagery in real-time. We call this approach neural (re-)rendering, and our live system "LookinGood". Our deep architecture is trained to produce high resolution and high quality images from a coarse rendering in real-time. First, we propose a self-supervised training method that does not require manual ground-truth annotation. We contribute a specialized reconstruction error that uses semantic information to focus on relevant parts of the subject, e.g. the face. We also introduce a salient reweighing scheme of the loss function that is able to discard outliers. We specifically design the system for virtual and augmented reality headsets where the consistency between the left and right eye plays a crucial role in the final user experience. Finally, we generate temporally stable results by explicitly minimizing the difference between two consecutive frames. We tested the proposed system in two different scenarios: the first involving a single RGB-D sensor and upper body reconstruction of an actor, the second consisting of full body 360 degree capture. Through extensive experimentation, we demonstrate how our system generalizes across unseen sequences and subjects. The supplementary video is available at http://youtu.be/Md3tdAKoLGU.
Breast cancer is the most common invasive cancer in women, affecting more than 10% of women worldwide. Microscopic analysis of a biopsy remains one of the most important methods to diagnose the type of breast cancer. This requires specialized analysis by pathologists, in a task that i) is highly time- and cost-consuming and ii) often leads to nonconsensual results. The relevance and potential of automatic classification algorithms using hematoxylin-eosin stained histopathological images has already been demonstrated, but the reported results are still sub-optimal for clinical use. With the goal of advancing the state-of-the-art in automatic classification, the Grand Challenge on BreAst Cancer Histology images (BACH) was organized in conjunction with the 15th International Conference on Image Analysis and Recognition (ICIAR 2018). A large annotated dataset, composed of both microscopy and whole-slide images, was specifically compiled and made publicly available for the BACH challenge. Following a positive response from the scientific community, a total of 64 submissions, out of 677 registrations, effectively entered the competition. From the submitted algorithms it was possible to push forward the state-of-the-art in terms of accuracy (87%) in automatic classification of breast cancer with histopathological images. Convolutional neural networks were the most successful methodology in the BACH challenge. Detailed analysis of the collective results allowed the identification of remaining challenges in the field and recommendations for future developments. The BACH dataset remains publicly available to promote further improvements to the field of automatic classification in digital pathology.
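The edge-preserving behaviour of window-based aggregation with adaptive support weights can be seen in a one-dimensional sketch. The weight form (an exponential of the absolute intensity difference), the window radius, and the value of sigma are assumptions chosen for illustration and are not taken from the abstract.

```python
import math

def aggregate(costs, intensities, radius=2, sigma=2.0):
    """Average per-pixel costs over a window, weighting each neighbour by
    its intensity similarity to the centre pixel (assumed weight form)."""
    n = len(costs)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w = math.exp(-abs(intensities[i] - intensities[j]) / sigma)
            num += w * costs[j]
            den += w
        out.append(num / den)
    return out

# Hypothetical scanline with a sharp intensity edge between pixels 2 and 3.
intensities = [10, 10, 10, 200, 200, 200]
costs = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]

adaptive = aggregate(costs, intensities)
uniform = aggregate(costs, [0] * len(costs))  # equal weights: plain box filter

print(adaptive[2], adaptive[3])  # costs stay separated across the edge
print(uniform[2], uniform[3])    # box filter mixes the two sides
```

With uniform weights the costs on either side of the edge bleed into each other, while the intensity-aware weights keep them apart, which matches the abstract's claim that the aggregation smooths the loss without blurring edges.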