I finally got around to writing another post after an unfortunately long break.
This one briefly summarizes the papers I found most interesting over the last month or so. What I find interesting about this review is that, in writing it, I came across two different works, both compelling, that solve opposite tasks, as mentioned in the title.
Deep Learning Markov Random Field for Semantic Segmentation – This one introduces the DPN (deep parsing network) for semantic segmentation. A single CNN jointly learns and infers the unary (per-pixel) and pairwise (inter-pixel) terms of an MRF model. Since it is CNN-based, it is easily parallelized and sped up. It achieves state-of-the-art results on the VOC12, Cityscapes, and CamVid datasets.
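To make the unary/pairwise distinction concrete, here is a toy NumPy sketch (not the paper's DPN architecture) of how a pairwise smoothness term refines per-pixel unary class scores with one mean-field-style update: each pixel's scores are pulled toward the average of its 4-neighbours. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def mean_field_step(unary, lam=0.5):
    """One toy mean-field-style update: smooth each pixel's class
    scores toward the mean scores of its 4-neighbours, mimicking the
    pairwise (smoothness) term of an MRF on top of unary scores.
    unary: (H, W, C) array of per-pixel class scores."""
    padded = np.pad(unary, ((1, 1), (1, 1), (0, 0)), mode="edge")
    neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                  padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    return (1 - lam) * unary + lam * neighbours

# Noisy 2-class score map: left half favours class 0, right half class 1.
rng = np.random.default_rng(0)
unary = np.zeros((8, 8, 2))
unary[:, :4, 0] = 1.0
unary[:, 4:, 1] = 1.0
unary += 0.3 * rng.standard_normal(unary.shape)

smoothed = mean_field_step(unary)
labels = smoothed.argmax(axis=-1)  # smoothed per-pixel class decisions
```

The point of the real paper is that both the unary scores and an update of this flavour are expressed as CNN layers, so the whole inference is learned end to end.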
DropNeuron: Simplifying the structure of deep neural networks – A novel approach to optimizing a deep neural network by regularizing its architecture. It is essentially a mechanism for dropping neurons during training, which allows one to construct simpler deep nets with comparable performance that are considerably faster and lighter. Code is provided, including examples.
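A minimal sketch of the idea, assuming a group-lasso-style penalty on each hidden neuron's incoming and outgoing weights (function names and shapes are my own, not the paper's code): summing per-neuron weight norms pushes entire neurons toward zero, after which they can be pruned.

```python
import numpy as np

def neuron_group_penalty(W_in, W_out):
    """Group-lasso style penalty on one hidden layer: for each hidden
    neuron, take the L2 norm of its incoming weights (a column of
    W_in) and its outgoing weights (a row of W_out). Summing these
    norms encourages whole neurons to shrink to zero.
    W_in: (n_inputs, n_hidden), W_out: (n_hidden, n_outputs)."""
    incoming = np.linalg.norm(W_in, axis=0)   # one norm per hidden neuron
    outgoing = np.linalg.norm(W_out, axis=1)
    return incoming.sum() + outgoing.sum()

def prunable_neurons(W_in, W_out, tol=1e-3):
    """Indices of hidden neurons whose weights are numerically zero."""
    strength = np.linalg.norm(W_in, axis=0) + np.linalg.norm(W_out, axis=1)
    return np.flatnonzero(strength < tol)

# Example: neuron 2 has been driven to zero by the penalty and can go.
W_in = np.ones((5, 4)); W_in[:, 2] = 0.0
W_out = np.ones((4, 3)); W_out[2, :] = 0.0
dead = prunable_neurons(W_in, W_out)
```

In training, `neuron_group_penalty` would simply be added to the loss; the pruning pass runs once after convergence.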
Deep Learning Relevance: Creating Relevant Information (as Opposed to Retrieving it) – This work trains an RNN that, given a query, synthesizes a relevant document containing information on that query. In a user study, the synthetic documents created by this approach were rated the most relevant.
Learning Fine-Scaled Depth Maps from Single RGB Images – This work presents a multi-scale ConvNet with skip fusion layers for inferring a depth map from a single RGB image. The depth maps are competitive with the state of the art and also lead to accurate, detail-rich 3D reconstructions, as you can see in the rightmost example below:
Exploring the Depths of Recurrent Neural Networks with Stochastic Residual Learning – This is a technical report, but it’s so filled with innovations that I assume that status is only temporary. They introduce what they call a Res-ENN, an equivalent for RNNs of ResNets (a highly successful network structure that facilitates the training of ultra-deep CNNs). In addition, to allow ultra-deep RNNs, they employ two regularization techniques – one that randomly drops layers (stochastic depth) and one that drops timesteps, i.e., words in a sentence (stochastic timesteps). This new type of network, with its corresponding regularization techniques, is applied to the NLP task of sentiment classification.
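The two regularizers are easy to sketch. Below is a hedged toy version, assuming the standard stochastic-depth recipe (skip a residual layer with some probability during training, rescale its contribution at test time) and a simple timestep-dropping mask; the function names and the exact rescaling are assumptions, not the report's code.

```python
import numpy as np

rng = np.random.default_rng(42)

def stochastic_residual_stack(x, layers, survival_prob=0.8, train=True):
    """Residual stack with stochastic depth: at training time each
    layer is skipped with probability 1 - survival_prob; at test time
    every layer runs, rescaled by its survival probability."""
    for layer in layers:
        if train:
            if rng.random() < survival_prob:
                x = x + layer(x)              # layer survives this pass
            # else: identity shortcut only -- the layer is "dropped"
        else:
            x = x + survival_prob * layer(x)  # expected-value rescaling
    return x

def drop_timesteps(tokens, keep_prob=0.9):
    """Stochastic timesteps: randomly drop whole input steps (e.g.
    words in a sentence) during training as a regularizer."""
    mask = rng.random(len(tokens)) < keep_prob
    kept = [t for t, m in zip(tokens, mask) if m]
    return kept if kept else tokens[:1]       # never return an empty sequence

layers = [lambda x: 0.1 * x] * 3              # three toy residual branches
out = stochastic_residual_stack(np.ones(4), layers, train=False)
```

In test mode the stack above is deterministic: each layer multiplies the activation by 1 + 0.8 · 0.1.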
A Hierarchical Model for Text Autosummarization – A hierarchical LSTM encoder-decoder model is used for the task of text summarization with good results. Here’s an example:
Original document: official says number ’ of emails copyright 2015 cable news network/turner broadcasting system , inc. all rights reserved . this material may not be published , broadcast , rewritten , or redistributed . an email chain between former secretary of state hillary clinton and of u.s. central command david petraeus from january and february 2009 is raising questions about whether some of the emails on clinton ’s private email server are mistakenly deemed personal and not included among the 55,000 pages of emails she turned over to the state department .
Original title: new hillary clinton email chain discovered
Generated summary: hillary clinton email service discovered
Pretty cool, right?
Deep Depth Super-Resolution: Learning Depth Super-Resolution using Deep Convolutional Neural Network – An end-to-end CNN is trained to map low-resolution depth images to high-resolution ones, achieving state-of-the-art performance on various benchmarks. Depth statistics are used as a prior in the regularization of the network.
Fast Robust Monocular Depth Estimation for Obstacle Detection with Fully Convolutional Networks – The authors propose a fully convolutional encoder-decoder network that extracts a depth map from an image together with an optical flow estimate computed by the Brox algorithm. In addition to real training data, simulated data generated with the Unreal engine is used. The approach is intended as the basis for an obstacle detection system.
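For concreteness, here is how an "image + optical flow" input can be assembled: stack the RGB frame and the 2-channel flow field along the channel axis. The exact input encoding the paper uses is an assumption here; the shapes are purely illustrative.

```python
import numpy as np

def build_network_input(rgb, flow):
    """Stack an RGB frame (H, W, 3) with a dense optical-flow field
    (H, W, 2) into the 5-channel tensor a depth network could consume.
    (Hypothetical encoding -- the paper's actual input layout may differ.)"""
    assert rgb.shape[:2] == flow.shape[:2], "frames must align spatially"
    return np.concatenate([rgb, flow], axis=-1)

# Illustrative 6x8 frame with a zero flow field.
rgb = np.zeros((6, 8, 3))
flow = np.zeros((6, 8, 2))
x = build_network_input(rgb, flow)
```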